SLIDE 1

An eigenvalue optimization problem for graph partitioning

Chris White

UT-Austin

February 5, 2014

Chris White (UT-Austin) An eigenvalue optimization problem February 5, 2014 1 / 28

SLIDE 2

Outline

  • Quick introduction to clustering & graph partitioning
  • Previous related work
  • The Dirichlet Energy
      • Definitions
      • A Relaxation
      • A Rearrangement Algorithm
  • Connections
      • Nonnegative Matrix Factorization
      • Reaction-Diffusion Equations

SLIDE 3

Intro to Clustering

Cluster Analysis

Cluster analysis seeks to find meaningful groups within data by optimizing some measure of similarity or dissimilarity.

[Figure: Example Clustering]

SLIDE 4

Intro to Clustering

Graph Partitioning

In the graph partitioning framework, one forms a graph where the nodes represent the observed data points and the edge weights represent some measure of similarity, with the goal of utilizing geometric tools and insights to analyze the data.

SLIDE 8

Intro to Clustering

Challenges

  • Large, high-dimensional data sets
  • Typical formulations of graph problems lead to NP-hard problems
  • The measure of optimality for clustering can be application-dependent

SLIDE 9

Intro to Clustering

Some Notation

The graph Laplacian(s): $\Delta_{G,r} := D^{1-r} - D^{-r/2} W D^{-r/2}$

The gradient of a function $f : V \to \mathbb{R}$: $\nabla f(v, w) := f(v) - f(w)$

The inner product: $\langle f, g \rangle_{V,r} := \sum_{i \in V} d_i^r \, f(i) \, g(i)$
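The notation above can be made concrete with a small sketch. It assumes a dense, symmetric weight matrix `W` with strictly positive degrees; the function names `graph_laplacian` and `inner_product` are illustrative and not taken from the talk.

```python
import numpy as np

def graph_laplacian(W, r=0.0):
    """Return D^(1-r) - D^(-r/2) W D^(-r/2) for a symmetric weight matrix W."""
    W = np.asarray(W, dtype=float)
    d = W.sum(axis=1)                     # degrees d_i = sum_j w_ij (assumed > 0)
    scale = d ** (-r / 2.0)               # diagonal of D^(-r/2)
    return np.diag(d ** (1.0 - r)) - scale[:, None] * W * scale[None, :]

def inner_product(f, g, d, r=0.0):
    """Weighted inner product sum_i d_i^r f(i) g(i)."""
    return float(np.sum(d ** r * np.asarray(f) * np.asarray(g)))

# Path graph on three vertices: 0 - 1 - 2, unit weights.
W = np.array([[0.0, 1.0, 0.0],
              [1.0, 0.0, 1.0],
              [0.0, 1.0, 0.0]])
d = W.sum(axis=1)
L0 = graph_laplacian(W, r=0)   # unnormalized Laplacian D - W
L1 = graph_laplacian(W, r=1)   # symmetric normalized Laplacian
```

With r = 0 this gives the familiar D - W, and with r = 1 the symmetric normalized Laplacian, whose null vector is the square root of the degree vector.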

SLIDE 11

Some Previous Work

Previous Work

The Cheeger cut (or balanced cut) of a graph $(V, E)$ is the following quantity:

$$h(G) := \min_{S \subset V} \frac{|\partial S|}{\min\{\mathrm{vol}(S), \mathrm{vol}(S^c)\}}$$

where $|\partial S| := \sum_{i \in S,\, j \notin S} w_{ij}$ is the perimeter of the vertex set $S$ and $\mathrm{vol}(S) = \sum_{i \in S} d_i$.

  • Provides a geometrically meaningful bi-partition of the graph
  • NP-hard to compute
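To see the definition in action, here is a brute-force sketch that enumerates every proper subset S. It is only feasible for very small graphs, which is consistent with the NP-hardness noted above. The function name `cheeger_cut` and the example graph (two triangles joined by a weak edge) are illustrative assumptions.

```python
import itertools
import numpy as np

def cheeger_cut(W):
    """Exhaustively minimize |dS| / min(vol(S), vol(S^c)) over proper subsets S."""
    W = np.asarray(W, dtype=float)
    n = W.shape[0]
    d = W.sum(axis=1)
    best_value, best_set = np.inf, None
    for size in range(1, n):
        for S in itertools.combinations(range(n), size):
            mask = np.zeros(n, dtype=bool)
            mask[list(S)] = True
            perimeter = W[mask][:, ~mask].sum()        # total weight crossing the cut
            value = perimeter / min(d[mask].sum(), d[~mask].sum())
            if value < best_value:
                best_value, best_set = value, S
    return best_value, best_set

# Two triangles {0,1,2} and {3,4,5} joined by a weak edge 2-3 of weight 0.1.
W = np.zeros((6, 6))
for i, j, w in [(0, 1, 1), (0, 2, 1), (1, 2, 1),
                (3, 4, 1), (3, 5, 1), (4, 5, 1), (2, 3, 0.1)]:
    W[i, j] = W[j, i] = w

h, S = cheeger_cut(W)   # the optimal cut separates the two triangles
```

Any cut that splits a triangle must sever at least two unit-weight edges, so the weak bridge is the only cheap cut and the minimizer is one of the triangles.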

SLIDE 12

Some Previous Work

Some Previous Work

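[Bresson et al. (2013)] attempt to solve

$$\min \sum_{r=1}^{R} \frac{|\partial A_r|}{\min\{\lambda |A_r|, |A_r^c|\}}.$$

Relaxation:

$$\min \sum_{r=1}^{R} \frac{\|f_r\|_{TV}}{\|f_r - \mathrm{med}_\lambda(f_r)\|_{1,\lambda}} \quad \text{subject to } f_r : V \to [0, 1], \quad \sum_{r=1}^{R} f_r = 1.$$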

SLIDE 15

Some Previous Work

Some Previous Work

Using ideas from materials science, [Bertozzi and Flenner (2012)] introduce the following graph-based Ginzburg-Landau functional:

$$E(u) := \langle u, \Delta u \rangle + \frac{1}{2\epsilon} \sum_{v \in V} \left(u^2(v) - 1\right)^2 + F(u, u_0).$$

Using numerical methods for mean curvature flow [MBO], in [Merkurjev et al. (2012)] a fast spectral-based algorithm was developed for finding minima.

This has inspired interesting work on analogues of mean curvature on graphs [van Gennip et al. (2013)].
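As a rough illustration of how the functional is evaluated, the sketch below computes the Dirichlet term and the double-well term for a vector `u`. The slide does not specify the fidelity term F(u, u0), so it is passed in as a precomputed number `fidelity` (a hypothetical parameter, defaulting to zero), and the unnormalized Laplacian D - W is assumed.

```python
import numpy as np

def ginzburg_landau_energy(u, L, eps, fidelity=0.0):
    """<u, L u> + (1 / (2 eps)) * sum_v (u(v)^2 - 1)^2 + fidelity."""
    u = np.asarray(u, dtype=float)
    dirichlet = u @ L @ u                                  # smoothness across edges
    double_well = np.sum((u ** 2 - 1.0) ** 2) / (2.0 * eps)  # pushes u(v) toward +/-1
    return float(dirichlet + double_well + fidelity)

# Two disconnected edges: {0,1} and {2,3}.
W = np.array([[0.0, 1.0, 0.0, 0.0],
              [1.0, 0.0, 0.0, 0.0],
              [0.0, 0.0, 0.0, 1.0],
              [0.0, 0.0, 1.0, 0.0]])
L = np.diag(W.sum(axis=1)) - W

E_binary = ginzburg_landau_energy([1, 1, -1, -1], L, eps=0.5)  # no cut edges, u = +/-1
E_zero = ginzburg_landau_energy([0, 0, 0, 0], L, eps=0.5)      # pays the double-well cost
```

A labeling that is constant on each connected component and takes values in {-1, +1} has zero energy here, while the undecided state u = 0 is penalized by the double-well term.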

SLIDE 20

A New Approach

We will now describe a new approach to graph partitioning based on Dirichlet eigenvalues, inspired by the analogous continuous problem.

Advantages:

  • Easy-to-implement algorithm, with convergence and local optimality guarantees
  • Representatives for each cluster
  • Interesting PageRank interpretation
  • Relationship to other areas: geometric domain decomposition, reaction-diffusion equations

This is joint work with Braxton Osting and Édouard Oudet [Osting et al. (2013)].

SLIDE 21

A New Approach

Dirichlet Eigenvalues

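Recall that for a domain $\Omega \subset \mathbb{R}^n$, the Dirichlet eigenvalue $\lambda_1(\Omega)$ is defined to be the smallest number $\lambda$ for which there exists a nontrivial solution to the following Dirichlet problem:

$$\Delta\psi = \lambda\psi \ \text{ in } \Omega, \qquad \psi = 0 \ \text{ on } \partial\Omega.$$

Here $\Delta$ is taken with the sign convention that makes it positive semidefinite, matching the graph Laplacian. Equivalently,

$$\lambda_1(\Omega) = \inf_{\psi \neq 0,\ \psi|_{\partial\Omega} = 0} \frac{\|\nabla\psi\|_2^2}{\|\psi\|_2^2}.$$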

SLIDE 22

A New Approach

Dirichlet Energy

Analogously, for a vertex subset $S \subset V$ we define

$$\lambda_1(S) = \min_{\psi \neq 0,\ \psi|_{S^c} = 0} \frac{\|\nabla\psi\|_2^2}{\|\psi\|_2^2}. \tag{1}$$

$\lambda_1(S)$ is a Dirichlet eigenvalue for $\Delta_G$, and Perron-Frobenius theory tells us that the associated eigenvector can be taken to be strictly positive inside $S$.
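Because ψ is forced to vanish outside S, the minimization in (1) reduces to finding the smallest eigenvalue of the principal submatrix of the Laplacian indexed by S. The sketch below assumes the unnormalized Laplacian D - W (the r = 0 case); `dirichlet_eigenpair` is an illustrative name.

```python
import numpy as np

def dirichlet_eigenpair(L, S):
    """Smallest eigenpair of L restricted to S, with the eigenvector zero-extended to V."""
    S = np.asarray(sorted(S))
    vals, vecs = np.linalg.eigh(L[np.ix_(S, S)])   # eigenvalues in ascending order
    psi = np.zeros(L.shape[0])
    psi[S] = vecs[:, 0]
    if psi.sum() < 0:                              # fix the sign so psi >= 0
        psi = -psi
    return vals[0], psi

# Path graph 0 - 1 - 2 - 3 with unit weights; S is the two interior vertices.
W = np.diag(np.ones(3), 1) + np.diag(np.ones(3), -1)
L = np.diag(W.sum(axis=1)) - W
lam, psi = dirichlet_eigenpair(L, [1, 2])
```

For this example the restricted matrix is [[2, -1], [-1, 2]], so the Dirichlet eigenvalue is 1 and the eigenvector is constant on S and zero elsewhere.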

SLIDE 25

A New Approach

Dirichlet Energy

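On a graph, we have the following inequalities:

Gershgorin Circle Theorem:

$$\min_{i \in S} \Big[ d_i - \sum_{j \in S} w_{ij} \Big] \le \lambda_1(S) \le \frac{|\partial S|}{\mathrm{vol}(S)}$$

Local Cheeger Inequality [Chung (2007)]:

$$\frac{h_S^2}{2} \le \lambda_1(S) \le h_S$$

Thus we seek to minimize the Dirichlet energy of a $k$-partition $\{V_i\}_{i=1}^k$:

$$\min_{V = \amalg_{i=1}^k V_i} \ \sum_{i=1}^k \lambda_1(V_i). \tag{2}$$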
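The objective in (2) can be evaluated directly for any candidate partition. The following sketch assumes the unnormalized Laplacian D - W and a label vector with one integer per vertex; `dirichlet_energy` is an illustrative name, and the test graph (two triangles joined by a weak edge) is made up for the example.

```python
import numpy as np

def dirichlet_energy(W, labels):
    """Sum over clusters of the smallest eigenvalue of L restricted to that cluster."""
    W = np.asarray(W, dtype=float)
    labels = np.asarray(labels)
    L = np.diag(W.sum(axis=1)) - W
    total = 0.0
    for i in np.unique(labels):
        S = np.flatnonzero(labels == i)
        total += np.linalg.eigvalsh(L[np.ix_(S, S)])[0]
    return float(total)

# Two triangles {0,1,2} and {3,4,5} joined by a weak edge 2-3 of weight 0.1.
W = np.zeros((6, 6))
for i, j, w in [(0, 1, 1), (0, 2, 1), (1, 2, 1),
                (3, 4, 1), (3, 5, 1), (4, 5, 1), (2, 3, 0.1)]:
    W[i, j] = W[j, i] = w

natural = dirichlet_energy(W, [0, 0, 0, 1, 1, 1])   # one cluster per triangle
mixed = dirichlet_energy(W, [0, 0, 1, 0, 1, 1])     # vertices 2 and 3 swapped
```

The partition that follows the triangles has a much smaller energy than the one that swaps the two bridge vertices, which is the behavior the objective is designed to reward.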

SLIDE 26

A New Approach

Dirichlet Energy : Relaxation

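For a vertex function $\phi : V \to [0, 1]$ and $\alpha > 0$, consider the quantity

$$\lambda^\alpha(\phi) := \min_{\|\psi\|_2 = 1} \ \|\nabla\psi\|_2^2 + \alpha \|\psi\|^2_{(1-\phi)}$$

and the relaxed Dirichlet energy

$$\Lambda^{\alpha,*}_k := \min_{\{\phi_i\}_{i=1}^k \in \mathcal{A}_k} \ \sum_{i=1}^k \lambda^\alpha(\phi_i) \tag{3}$$

where $\mathcal{A}_k := \left\{ \{\phi_i\}_{i=1}^k : \phi_i : V \to [0, 1] \text{ and } \sum_{i=1}^k \phi_i = 1 \right\}$.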
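Reading the penalty term as the squared norm of ψ weighted by 1 - φ (an interpretation of the notation, not something stated on the slide), the relaxed quantity is the smallest eigenvalue of the Laplacian plus a diagonal potential α(1 - φ). A minimal sketch, assuming the unnormalized Laplacian and the illustrative name `relaxed_eigenpair`:

```python
import numpy as np

def relaxed_eigenpair(L, phi, alpha):
    """Smallest eigenpair of L + alpha * diag(1 - phi)."""
    phi = np.asarray(phi, dtype=float)
    vals, vecs = np.linalg.eigh(L + alpha * np.diag(1.0 - phi))
    psi = vecs[:, 0]
    if psi.sum() < 0:          # choose the nonnegative representative
        psi = -psi
    return vals[0], psi

# Path graph 0 - 1 - 2 - 3 with unit weights.
W = np.diag(np.ones(3), 1) + np.diag(np.ones(3), -1)
L = np.diag(W.sum(axis=1)) - W

lam_all, psi_all = relaxed_eigenpair(L, np.ones(4), alpha=3.0)    # phi = 1: no penalty
lam_none, psi_none = relaxed_eigenpair(L, np.zeros(4), alpha=3.0)  # phi = 0: penalty everywhere
```

With φ identically 1 the penalty vanishes and the minimizer is the constant vector with eigenvalue 0; with φ identically 0 every vertex is penalized and the eigenvalue is shifted up by exactly α.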

SLIDE 28

A New Approach

Dirichlet Energy : Relaxation

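  • Theorem. [Osting, White, Oudet (2013)] For $S \subset V$,

$$\lim_{\alpha \to \infty} \lambda^\alpha(\chi_S) = \lambda_1(S) \quad \text{and} \quad \lim_{\alpha \to \infty} \psi^\alpha(\chi_S) = \psi^D(S),$$

where $\psi^D(S)$ is the Dirichlet eigenvector achieving the minimum in (1).

  • Theorem. [OWO (2013)] Let $k \in \mathbb{Z}^+$ and $\alpha > 0$. Every (local) minimizer of $\Lambda^{\alpha,*}_k$ over $\mathcal{A}_k$ is a collection of indicator functions.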
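The first limit can be checked numerically on a small graph. The sketch below again assumes the unnormalized Laplacian and the weighted-norm reading of the penalty; the variable names and the choice of α values are illustrative.

```python
import numpy as np

# Path graph 0 - 1 - 2 - 3 with unit weights; S is the two interior vertices.
W = np.diag(np.ones(3), 1) + np.diag(np.ones(3), -1)
L = np.diag(W.sum(axis=1)) - W
S = [1, 2]

# Exact Dirichlet eigenvalue: smallest eigenvalue of L restricted to S.
lam_S = np.linalg.eigvalsh(L[np.ix_(S, S)])[0]

# Relaxed eigenvalue for the indicator of S at increasing penalty strength.
chi = np.zeros(4)
chi[S] = 1.0
alphas = [1.0, 10.0, 100.0, 1e4, 1e6]
lams = [np.linalg.eigvalsh(L + a * np.diag(1.0 - chi))[0] for a in alphas]

for a, lam in zip(alphas, lams):
    print(f"alpha = {a:>9.0f}   lambda^alpha = {lam:.6f}   gap = {lam_S - lam:.2e}")
```

The relaxed values increase with α and stay below the Dirichlet eigenvalue, since restricting ψ to be supported on S can only raise the minimum.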

SLIDE 30

A New Approach

Dirichlet Energy : Algorithm

  • Lemma. $\lambda^\alpha(\phi)$ is a concave function of $\phi$.

Thus the relaxed problem minimizes a concave objective and is non-convex.

Rearrangement Algorithm

Input: an initial $\{\phi_i\}_{i=1}^k \in \mathcal{A}_k$.

while not converged, do
    For $i = 1, \ldots, k$, compute the (positive and normalized) eigenfunction $\psi_i$ corresponding to $\lambda^\alpha(\phi_i)$.
    Assign each node $v \in V$ the label $i = \arg\max_j \psi_j(v)$.
    Let $\{\phi_i\}_{i=1}^k$ be the indicator functions for the labels.
end while
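The loop above translates into a few lines of NumPy. This is a minimal dense-matrix sketch, not the authors' implementation: it assumes a connected graph, the unnormalized Laplacian, an initialization given as integer labels (so each φ_i is already an indicator), and a default `alpha=10.0` chosen arbitrarily. The name `rearrangement` and the `max_iter` safeguard are assumptions.

```python
import numpy as np

def rearrangement(W, labels, k, alpha=10.0, max_iter=100):
    """Iterate: ground states of L + alpha*diag(1 - phi_i), then relabel by argmax."""
    W = np.asarray(W, dtype=float)
    L = np.diag(W.sum(axis=1)) - W
    labels = np.asarray(labels).copy()
    n = W.shape[0]
    for _ in range(max_iter):
        Psi = np.empty((n, k))
        for i in range(k):
            phi = (labels == i).astype(float)            # indicator of cluster i
            _, vecs = np.linalg.eigh(L + alpha * np.diag(1.0 - phi))
            Psi[:, i] = np.abs(vecs[:, 0])               # positive, unit-norm ground state
        new_labels = Psi.argmax(axis=1)                  # label = species with largest psi
        if np.array_equal(new_labels, labels):           # fixed point reached
            break
        labels = new_labels
    return labels

# Two triangles {0,1,2} and {3,4,5} joined by a weak edge 2-3 of weight 0.1.
W = np.zeros((6, 6))
for i, j, w in [(0, 1, 1), (0, 2, 1), (1, 2, 1),
                (3, 4, 1), (3, 5, 1), (4, 5, 1), (2, 3, 0.1)]:
    W[i, j] = W[j, i] = w

# Start with vertex 2 placed in the wrong cluster.
result = rearrangement(W, [0, 0, 1, 1, 1, 1], k=2)
```

Taking the absolute value of the first eigenvector relies on the Perron-Frobenius property for connected graphs, which guarantees the ground state has a single sign. For large sparse graphs a sparse eigensolver would replace the dense `eigh` call.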

SLIDE 31

A New Approach

Dirichlet Energy : Algorithm

Theorem. [OWO (2013)] Fix $\alpha > 0$. For any initialization, the rearrangement algorithm terminates in a finite number of steps at a local minimum of $\Lambda^\alpha_k$.

SLIDE 32

Numerical Experiments

Numerical Experiments

[Figure: Five Moons]

SLIDE 33

Numerical Experiments

MNIST Handwritten Digits, n=70,000

[Figure panels: Cluster Means; Cluster Representatives]

SLIDE 34

Connections

Connections

Connections to other areas / problems:

  • Geometric Domain Decomposition
  • Nonnegative Matrix Factorization
  • Reaction-Diffusion Equations

SLIDE 35

Connections

Geometric Domain Decomposition

Our objective was inspired by the analogous continuous problem; the rearrangement algorithm works well on discretizations of manifolds, as seen below:

[Figures: Torus, k=25; Sphere, k=3]

SLIDE 36

Connections

NMF

Nonnegative Matrix Factorization (NMF) is another approach to clustering, which attempts to solve

$$\min_{U \in \mathcal{M}} \ \|S - UU^T\|_F^2$$

for a given similarity matrix $S$, where $\mathcal{M} := \left\{ U \in \mathbb{R}^{n \times k} : U^T U = \mathrm{Id}_k,\ U_{ij} \ge 0 \right\}$.
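The constraint set is what links this problem to clustering: nonnegative columns can only be orthogonal if their supports are disjoint, so each row of U has at most one nonzero entry and U encodes a partition. The sketch below builds such a U from a label vector and evaluates the objective; the function names are illustrative, and every cluster is assumed to be nonempty.

```python
import numpy as np

def partition_to_U(labels, k):
    """Nonnegative matrix with orthonormal columns encoding a partition into k clusters."""
    labels = np.asarray(labels)
    U = np.zeros((labels.size, k))
    U[np.arange(labels.size), labels] = 1.0
    return U / np.sqrt(U.sum(axis=0))    # normalize columns (assumes no empty cluster)

def nmf_objective(S, U):
    """Squared Frobenius norm of S - U U^T."""
    return float(np.linalg.norm(S - U @ U.T, "fro") ** 2)

U_good = partition_to_U([0, 0, 1, 1], k=2)
U_bad = partition_to_U([0, 1, 0, 1], k=2)

# A block-structured similarity matrix that U_good reproduces exactly.
S_block = U_good @ U_good.T
fit_good = nmf_objective(S_block, U_good)
fit_bad = nmf_objective(S_block, U_bad)
```

The partition matching the block structure of the similarity matrix attains zero residual, while a partition that cuts across the blocks does not.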

SLIDE 38

Connections

NMF

  • Proposition. [OWO (2013)] Let $\Psi^* := \left[ \psi^D_1 \,|\, \cdots \,|\, \psi^D_k \right]$ be the Dirichlet eigenvectors corresponding to the optimal partition. Then $\Psi^*$ satisfies:

$$D^{-1/2}\Psi^* = \arg\min_{U \in \mathcal{M}} \ \|WD^{-1} - UU^T\|_F^2,$$

where $\mathcal{M} := \{ U \in \mathbb{R}^{n \times k} : U^T U = \mathrm{Id},\ U_{ij} \ge 0 \}$.

Observe that $WD^{-1}$ is the random walk transition matrix on the graph.
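The random walk interpretation is easy to verify: multiplying W on the right by the inverse degree matrix divides column j by d_j, so every column sums to one and the matrix maps probability vectors to probability vectors. A small sketch with a made-up weighted graph:

```python
import numpy as np

# A small weighted graph (symmetric weights, positive degrees).
W = np.array([[0.0, 2.0, 1.0],
              [2.0, 0.0, 3.0],
              [1.0, 3.0, 0.0]])
d = W.sum(axis=0)

P = W / d                      # W D^{-1}: column j is divided by d_j (broadcast over columns)

p0 = np.array([1.0, 0.0, 0.0])  # walker starts at vertex 0
p1 = P @ p0                     # distribution after one step
```

Here `p1` is the first column of `P`: the walker moves from vertex 0 to a neighbor with probability proportional to the edge weight.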

SLIDE 42

Connections

Reaction-Diffusion Equations

  • Brownian motion of many particles with k particle “species”.
  • When two particles of different species meet, they annihilate each other.
  • When a particle reaches the boundary, it is annihilated.
  • When a particle is annihilated, another particle of the same species is chosen uniformly at random and duplicated, so that the number of particles of each species remains constant.

[Figure: stationary states of the process, from FIG. 4 of O. Cybulski, V. Babin, and R. Holyst, Minimization of the Renyi entropy production in the space-partitioning process (2005).]

SLIDE 44

Connections

Reaction-Diffusion Equations

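Attempt at modeling this system: Osting & White (2013)

$$\frac{d}{dt} p_i = -(\Delta + \kappa V_i)\, p_i + \frac{\kappa}{n} \langle p_i, V_i \rangle \mathbf{1}, \qquad i = 1, \ldots, k, \tag{4}$$

where $V_i = \sum_{j \neq i} p_j^2$ is a nonlinear potential, and $\kappa > 0$ is an interaction parameter.

This is the $\ell^2$ gradient flow of the energy

$$E[p] = \frac{1}{2} \sum_i \langle p_i, \Delta p_i \rangle + \frac{\kappa}{4} \sum_{i \neq j} \langle p_i^2, p_j^2 \rangle$$

subject to the constraints that $\langle p_i, \mathbf{1} \rangle = 1$ for all $i \in [k]$.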
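One property of (4) that is easy to check numerically is that the last term exactly cancels the mass lost to the reaction term, so the constraint on each species is preserved along the flow. The sketch below evaluates the right-hand side and takes one forward-Euler step. It assumes the unnormalized Laplacian, the unweighted inner product, and arbitrary illustrative values for the step size and κ; `flow_rhs` is a hypothetical name.

```python
import numpy as np

def flow_rhs(P, L, kappa):
    """Right-hand side of (4); column i of P is the density p_i."""
    n = P.shape[0]
    sq = P ** 2
    V = sq.sum(axis=1, keepdims=True) - sq           # V_i = sum over j != i of p_j^2
    correction = (kappa / n) * (P * V).sum(axis=0)   # (kappa / n) <p_i, V_i>, one value per species
    # The correction broadcasts down each column, i.e. it multiplies the all-ones vector.
    return -(L @ P) - kappa * V * P + correction

# Path graph on 5 vertices, k = 2 species, random positive initial densities.
W = np.diag(np.ones(4), 1) + np.diag(np.ones(4), -1)
L = np.diag(W.sum(axis=1)) - W
rng = np.random.default_rng(0)
P = rng.random((5, 2))
P /= P.sum(axis=0)                                    # enforce <p_i, 1> = 1

R = flow_rhs(P, L, kappa=5.0)
P_next = P + 1e-3 * R                                 # one forward-Euler step
```

Each column of `R` sums to zero, because the Laplacian annihilates constants and the correction term adds back exactly the mass removed by the potential, so `P_next` still satisfies the constraints.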

SLIDE 45

Connections

Reaction-Diffusion Equations

Conjecture: as κ → ∞, the stationary states of (4) are equivalent to the Dirichlet eigenfunctions attaining (2). If true, this opens a new avenue for algorithm development: if stationary states of (4) can be found efficiently, then each node is assigned the class label of the species that dominates there.

SLIDE 46

References

References

  • A. L. Bertozzi and A. Flenner. Diffuse interface models on graphs for classification of high dimensional data. Multiscale Modeling & Simulation, 10(3):1090–1118, 2012.

  • X. Bresson, T. Laurent, D. Uminsky, and J. H. von Brecht. Multiclass total variation clustering, 2013.

  • F. Chung. Random walks and local cuts in graphs. Linear Algebra and its Applications, 423(1):22–32, 2007.

  • E. Merkurjev, T. Kostic, and A. Bertozzi. An MBO scheme on graphs for segmentation and image processing. UCLA CAM Report 12-46, 2012.

  • B. Osting, C. D. White, and E. Oudet. Minimal Dirichlet energy partitions for graphs. http://arxiv.org/abs/1308.4915, 2013.

  • Y. van Gennip, N. Guillen, B. Osting, and A. Bertozzi. Mean curvature, threshold dynamics, and phase field theory on finite graphs. Submitted, 2013.

SLIDE 47

Thank you!
