

SLIDE 1

Copula Mixture Model for Dependency-seeking Clustering

Melanie Rey, Volker Roth

Department of Mathematics and Computer Science, University of Basel, Switzerland

December 16, 2011

SLIDE 2

Dependency-seeking Clustering

◮ Clustering co-occurring samples from different data sources, called views.

◮ The aim is to cluster the points according to their between-view dependence structure.

SLIDE 3

Dependency-seeking Clustering

[Figure: scatter plots of view 1 (X1 vs. X2) and view 2 (Y1 vs. Y2); pooled over both clusters, (1) + (2): cor(X2, Y2) = 0.45.]

SLIDE 4

Dependency-seeking Clustering

[Figure: scatter plots of view 1 (X1 vs. X2) and view 2 (Y1 vs. Y2). Pooled: (1) + (2): cor(X2, Y2) = 0.45; within the clusters: (1): cor(X2, Y2) = 0.8, (2): cor(X1, Y1) = 0.45.]

SLIDE 5

Probabilistic CCA

◮ The probabilistic interpretation of CCA [Bach, 2005]:

\[ Z \sim \mathcal{N}_d(0, I_d), \qquad (X, Y) \mid Z \sim \mathcal{N}_{p+q}(WZ + \mu, \Psi), \]
where \(\Psi\) has the block-diagonal form \(\Psi = \begin{pmatrix} \Psi_x & 0 \\ 0 & \Psi_y \end{pmatrix}\).
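As a sketch, the generative model above can be simulated directly. The dimensions, the loading matrix W, and the noise blocks below are illustrative assumptions, not values from the talk:

```python
import numpy as np

rng = np.random.default_rng(1)

# Hypothetical sizes: latent dimension d = 1, view dimensions p = q = 2.
d, p, q = 1, 2, 2
W = np.array([[1.0], [0.5], [-0.8], [0.6]])   # assumed loading matrix, (p+q) x d
mu = np.zeros(p + q)

# Block-diagonal noise covariance: no cross-view noise terms.
Psi_x = 0.5 * np.eye(p)
Psi_y = 0.5 * np.eye(q)
Psi = np.block([[Psi_x, np.zeros((p, q))],
                [np.zeros((q, p)), Psi_y]])

n = 2000
Z = rng.normal(size=(n, d))                              # Z ~ N_d(0, I_d)
noise = rng.multivariate_normal(np.zeros(p + q), Psi, size=n)
XY = Z @ W.T + mu + noise                                # (X, Y) | Z ~ N_{p+q}(WZ + mu, Psi)
X, Y = XY[:, :p], XY[:, p:]
```

All cross-view covariance comes from the shared latent Z: marginally Cov((X, Y)) = WWᵀ + Ψ, and the noise Ψ contributes only within each view.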

SLIDE 6

Dependency-seeking Clustering

◮ Probabilistic dependency-seeking clustering [Klami, 2006]:

\[ Z \sim \mathrm{Mult}(\theta), \qquad (X, Y) \mid Z \sim \mathcal{N}_{p+q}(\mu_Z, \Psi), \]
where \(\Psi\) has the block-diagonal form \(\Psi = \begin{pmatrix} \Psi_x & 0 \\ 0 & \Psi_y \end{pmatrix}\).

◮ \(\Psi\) block-diagonal → views independent conditioned on the cluster assignment → the cluster structure captures the between-view dependencies.
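A minimal simulation (with assumed weights θ, means µ_z, and within-view covariances, chosen purely for illustration) makes the mechanism concrete: given the cluster label the two views are independent, yet marginally they are dependent through the shared label.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical 2-cluster example with p = q = 2, so (X, Y) lives in R^4.
theta = np.array([0.5, 0.5])                      # mixing weights
mu = np.array([[-2.0, -2.0, -2.0, -2.0],          # cluster means mu_z
               [ 2.0,  2.0,  2.0,  2.0]])

# Block-diagonal Psi: no covariance between the views given the cluster.
Psi_x = np.array([[1.0, 0.8], [0.8, 1.0]])
Psi_y = np.array([[1.0, 0.5], [0.5, 1.0]])
Psi = np.block([[Psi_x, np.zeros((2, 2))],
                [np.zeros((2, 2)), Psi_y]])

n = 1000
z = rng.choice(len(theta), size=n, p=theta)       # Z ~ Mult(theta)
XY = np.array([rng.multivariate_normal(mu[k], Psi) for k in z])
X, Y = XY[:, :2], XY[:, 2:]

# Marginally, X1 and Y1 are strongly correlated although Psi has no
# cross-view block: the dependence is carried entirely by the clusters.
```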

SLIDE 7

Clustering of non-Gaussian data

[Figure: scatter plots of view 1 (X1 vs. X2) and view 2 (Y1 vs. Y2) under the Gaussian model and under the copula model, with the corresponding marginal densities.]

SLIDE 8

Meta-Gaussian distribution

◮ Specify dependence by a Gaussian copula with block-diagonal correlation matrix \(P\):
\[ C^G_P(u) = \Phi_P\big(\Phi^{-1}(u_1), \ldots, \Phi^{-1}(u_d)\big). \]

◮ Margins are arbitrary continuous distributions:
\[ X^j \mid \theta = X^j \mid \theta_j \sim F^j_{X|\theta}, \quad j = 1, \ldots, p, \qquad Y^j \mid \theta = Y^j \mid \theta_j \sim F^j_{Y|\theta}, \quad j = 1, \ldots, q. \]

◮ Use Sklar's theorem to construct \(F_{\theta,P}\).
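The three-step construction via Sklar's theorem can be sketched by sampling: draw a Gaussian vector with correlation P, push it through Φ to get uniform margins (the Gaussian copula), then apply inverse marginal CDFs. The correlation blocks and the particular Gamma/Gaussian/Beta margins below are illustrative assumptions:

```python
import numpy as np
from scipy.stats import norm, beta, gamma

rng = np.random.default_rng(2)

# Hypothetical within-cluster correlation matrix P = diag(P_x, P_y)
# (block-diagonal, as on the slide), with p = q = 2.
P_x = np.array([[1.0, 0.7], [0.7, 1.0]])
P_y = np.array([[1.0, 0.4], [0.4, 1.0]])
P = np.block([[P_x, np.zeros((2, 2))],
              [np.zeros((2, 2)), P_y]])

n = 5000
# 1) Gaussian vector with correlation P.
z = rng.multivariate_normal(np.zeros(4), P, size=n)
# 2) Probability integral transform: u has uniform margins and the
#    dependence structure of the Gaussian copula C^G_P.
u = norm.cdf(z)
# 3) Arbitrary continuous margins via inverse CDFs (Sklar's theorem);
#    these margin choices are placeholders for illustration.
x1 = gamma.ppf(u[:, 0], a=2.0)
x2 = norm.ppf(u[:, 1], loc=1.0)
y1 = beta.ppf(u[:, 2], a=2.0, b=5.0)
y2 = beta.ppf(u[:, 3], a=0.5, b=0.5)
```

Within a view the margins stay dependent through P, while the two views are independent given the cluster, since P has no cross-view block.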

SLIDE 9

Meta-Gaussian density

◮ Consider \(F\) with copula \(C\) and margins \(F^1, \ldots, F^d\). If \(F\) has a density, it can be expressed as
\[ f(x_1, \ldots, x_d) = c\big(F^1(x_1), \ldots, F^d(x_d)\big) \prod_{j=1}^{d} f^j(x_j), \]
where \(c(u_1, \ldots, u_d) = \frac{\partial^d C(u_1, \ldots, u_d)}{\partial u_1 \cdots \partial u_d}\) is the copula density of \(C\).

◮ The Gaussian copula density has a simple form, and \(f_{\theta,P}\) is
\[ f_{(X,Y)|\theta,P}(x, y) = |P|^{-\frac{1}{2}} \exp\Big( -\tfrac{1}{2}\, \tilde{x}^{\mathsf T} (P^{-1} - I)\, \tilde{x} \Big) \prod_{j=1}^{p+q} f^j(x_j), \qquad \text{where } \tilde{x}_j = \Phi^{-1}\big(F^j(x_j)\big). \]
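The density formula above translates directly into code; representing the margins as a list of frozen scipy distributions is an implementation convenience assumed here, not notation from the talk:

```python
import numpy as np
from scipy.stats import norm, beta

def meta_gaussian_logpdf(x, P, margins):
    """Log-density of the meta-Gaussian distribution at a point x (shape (d,)).

    `margins` is a list of frozen scipy.stats distributions, one per
    coordinate (an assumed interface, for illustration).
    """
    u = np.array([m.cdf(xj) for m, xj in zip(margins, x)])
    x_tilde = norm.ppf(u)                                # normal scores
    _, logdet = np.linalg.slogdet(P)
    quad = x_tilde @ (np.linalg.inv(P) - np.eye(len(x))) @ x_tilde
    log_margins = sum(m.logpdf(xj) for m, xj in zip(margins, x))
    return -0.5 * logdet - 0.5 * quad + log_margins

# With standard normal margins the formula collapses to the N(0, P)
# log-density; with P = I it is just the product of the margins.
P_demo = np.array([[1.0, 0.5], [0.5, 1.0]])
val = meta_gaussian_logpdf(np.array([0.3, -0.7]), P_demo, [norm(), norm()])
```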

SLIDE 10

Mixture of Copula Model

The joint density of \(X\) and \(Y\) is a Dirichlet process mixture:
\[ f_{(X,Y)}(x, y) = \iint f_{(X,Y)|\theta,P}(x, y)\, d\mu_{\theta,P}\, d\mu_G(\lambda, G_0). \]
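One way to make the DP mixture concrete is a truncated stick-breaking draw of the mixture weights; the concentration λ, the truncation level, and the scalar stand-in for the base measure below are placeholders for illustration only:

```python
import numpy as np

rng = np.random.default_rng(3)

# Truncated stick-breaking sketch of a draw G ~ DP(lambda, G0).
lam, K = 1.0, 50
v = rng.beta(1.0, lam, size=K)                              # stick proportions
w = v * np.concatenate(([1.0], np.cumprod(1.0 - v)[:-1]))   # mixture weights
atoms = rng.normal(size=K)                                  # stand-in draws from G0

# In the model each atom would be a cluster parameter pair (theta_k, P_k),
# so the mixture density is sum_k w_k * f_{theta_k, P_k}(x, y).
```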

SLIDE 11

The priors

◮ Assume a priori independence of \(\theta\) and \(P\) → specify the priors separately.

◮ Specify prior distributions for \(P_x\) and \(P_y\), where \(P = \begin{pmatrix} P_x & 0 \\ 0 & P_y \end{pmatrix}\), assuming a priori independence.

◮ For \(P_x\) and \(P_y\) we choose the marginally uniform prior [Barnard, 2000]:
\[ f(R; d+1) \propto |R|^{\frac{d(d-1)}{2} - 1} \prod_{i=1}^{d} |R_{ii}|^{-\frac{d+1}{2}}. \]
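For concreteness, the prior above can be evaluated numerically. Note that reading \(R_{ii}\) as the i-th principal submatrix of R (row and column i deleted) is our interpretation of the notation in [Barnard, 2000]:

```python
import numpy as np

def marginally_uniform_logprior(R):
    """Log of the marginally uniform prior of [Barnard, 2000], up to an
    additive constant. R_ii is taken to be the principal submatrix of R
    with row and column i deleted (a notation assumption).
    """
    d = R.shape[0]
    _, logdet = np.linalg.slogdet(R)
    log_sub = 0.0
    for i in range(d):
        keep = [j for j in range(d) if j != i]
        _, ld = np.linalg.slogdet(R[np.ix_(keep, keep)])
        log_sub += ld
    return (d * (d - 1) / 2 - 1) * logdet - (d + 1) / 2 * log_sub

# Sanity check: for d = 2 the exponent on |R| is zero and the 1x1
# submatrices have log-determinant zero, so the prior is flat in the
# single correlation -- it is marginally uniform, as the name promises.
```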

SLIDE 12

Inference

◮ MCMC algorithm for the DP with a non-conjugate prior [Neal, 1998].

◮ Simplifies when using data augmentation: introduce the normal scores \((\tilde{X}, \tilde{Y})\),
\[ \tilde{X}^j = \Phi^{-1}\big(F^j(X^j)\big), \qquad \tilde{Y}^j = \Phi^{-1}\big(F^j(Y^j)\big). \]
We then have \((\tilde{X}, \tilde{Y}) \sim \mathcal{N}_{p+q}(0, P)\).
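In practice the marginal CDFs are unknown; one standard plug-in (assumed here, not prescribed by the slide) is the rescaled empirical CDF, which turns the augmentation step into a rank transform:

```python
import numpy as np
from scipy.stats import norm, rankdata

rng = np.random.default_rng(4)

def normal_scores(col):
    """Normal scores Phi^{-1}(F(x)), with F the rescaled empirical CDF.

    Dividing ranks by (n + 1) keeps the arguments strictly inside (0, 1);
    the empirical-CDF plug-in is an assumed choice for illustration.
    """
    n = len(col)
    u = rankdata(col) / (n + 1)
    return norm.ppf(u)

x = rng.gamma(shape=2.0, size=1000)   # a skewed, non-Gaussian margin
x_tilde = normal_scores(x)            # approximately N(0, 1)
```

The transform is monotone, so it preserves the ranks of the data while giving each coordinate an (approximately) standard normal margin.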

SLIDE 13

Simulations

[Figure: view-2 scatter plots (Y1 vs. Y2) under the Gaussian model, the copula model, and the CCA mixture, each paired with a histogram of the first margin Y1 (Gaussian clusters vs. their sum for the Gaussian model; Beta clusters vs. their sum for the copula model).]

SLIDE 14

Real data experiments

Two data sets containing information about the regulation of heat shock in yeast, [Gasch, 2000], [Harbison, 2004].

◮ First view: gene expressions for yeast measured at 4 time points → Gaussian.

◮ Second view: probability scores of binding interactions for 8 different regulators → Beta.

SLIDE 15

Real data experiments

[Figure: histograms of margin Y6 in view 2 under the Gaussian mixture and under the copula mixture.]

SLIDE 16

Conclusion

◮ Dependency-seeking clustering as an alternative to CCA for multi-view analysis.

◮ The Gaussian model produces misleading results when the Gaussian assumption is violated.

◮ Flexibility is increased using a copula mixture model.

Thank you!

melanie.rey@unibas.ch
