Deep Gaussian Mixture Models


  1. Deep Gaussian Mixture Models. Cinzia Viroli (University of Bologna, Italy), joint with Geoff McLachlan (University of Queensland, Australia). JOCLAD 2018, Lisbon, April 5th, 2018.

  2. Outline: Deep Learning; Mixture Models; Deep Gaussian Mixture Models.

  3. Deep Learning

  4. Deep Learning is a trendy topic in the machine learning community.

  5. What is Deep Learning? Deep Learning is a set of machine learning algorithms able to gradually learn a huge number of parameters in an architecture composed of multiple non-linear transformations (a multi-layer structure).
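To make the multi-layer idea concrete, here is a minimal sketch (not from the talk; the sizes and the tanh non-linearity are illustrative assumptions) of data passing through a composition of non-linear transformations:

```python
# Minimal sketch of "multiple non-linear transformations": a deep model
# composes parameterized layers. Sizes and tanh are illustrative choices.
import numpy as np

rng = np.random.default_rng(0)
x = rng.normal(size=(32, 10))        # a batch of 32 inputs with 10 features

def layer(x, w, b):
    return np.tanh(x @ w + b)        # one non-linear transformation

w1, b1 = rng.normal(size=(10, 8)), np.zeros(8)
w2, b2 = rng.normal(size=(8, 4)), np.zeros(4)

h = layer(layer(x, w1, b1), w2, b2)  # "deep" = transformations stacked in sequence
print(h.shape)                       # (32, 4)
```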

  6. Example of Learning

  7. Example of Deep Learning

  8. Facebook's DeepFace. DeepFace (Yaniv Taigman) is a deep learning facial recognition system that employs a nine-layer neural network with over 120 million connection weights. It identifies human faces in digital images with an accuracy of 97.35%.

  9. Mixture Models

  10. Gaussian Mixture Models (GMM). In model-based clustering, data are assumed to come from a finite mixture model (McLachlan and Peel, 2000; Fraley and Raftery, 2002). For quantitative data, each mixture component is usually modeled as a multivariate Gaussian distribution:

$$f(y; \theta) = \sum_{j=1}^{k} \pi_j \, \phi^{(p)}(y; \mu_j, \Sigma_j)$$

Growing popularity, widely used.
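The density above is straightforward to evaluate numerically. A minimal sketch in Python (not part of the talk; the weights, means and covariances below are illustrative, and SciPy is assumed to be available):

```python
# Evaluate f(y; theta) = sum_j pi_j * phi^(p)(y; mu_j, Sigma_j).
import numpy as np
from scipy.stats import multivariate_normal

pi = np.array([0.5, 0.3, 0.2])  # mixing proportions, sum to 1
mu = [np.zeros(2), np.array([3.0, 0.0]), np.array([0.0, 3.0])]
Sigma = [np.eye(2), np.eye(2) * 0.5, np.eye(2) * 2.0]

def gmm_density(y):
    return sum(p * multivariate_normal.pdf(y, m, S)
               for p, m, S in zip(pi, mu, Sigma))

print(gmm_density(np.array([1.0, 1.0])))
```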

  11. However, in recent years, much research has addressed two issues. High-dimensional data: when the number of observed variables is large, it is well known that the GMM is an over-parameterized solution. Non-Gaussian data: when data are not Gaussian, the GMM may require more components than there are true clusters, thus requiring merging or alternative distributions.
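To see the over-parameterization concretely, a back-of-the-envelope count of the free parameters in a full-covariance GMM (a sketch, not from the talk):

```python
# Free parameters in a full-covariance GMM with k components in p dimensions.
def gmm_n_params(k, p):
    weights = k - 1              # mixing proportions (constrained to sum to 1)
    means = k * p                # one p-vector per component
    covs = k * p * (p + 1) // 2  # one symmetric p x p matrix per component
    return weights + means + covs

print(gmm_n_params(k=3, p=10))    # 197
print(gmm_n_params(k=3, p=1000))  # 1504502: over 1.5 million parameters
```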

  12. High-dimensional data. Some solutions (among others) follow two directions: constrained model-based clustering and dimensionally reduced model-based clustering. Constrained model-based clustering: Banfield and Raftery (1993) and Celeux and Govaert (1995) proposed constrained GMMs based on a parameterization of the generic component-covariance matrix via its spectral decomposition, $\Sigma_i = \lambda_i A_i^\top D_i A_i$; Bouveyron et al. (2007) proposed a different parameterization of the generic component-covariance matrix. Dimensionally reduced model-based clustering: Ghahramani and Hinton (1997) and McLachlan et al. (2003) proposed Mixtures of Factor Analyzers (MFA); Yoshida et al. (2004), Baek and McLachlan (2008) and Montanari and Viroli (2010) proposed Factor Mixture Analysis (FMA), or Common MFA; McNicholas and Murphy (2008) proposed eight parameterizations of the covariance matrices in MFA.
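The MFA constraint mentioned above shrinks each component covariance to $\Sigma_j = \Lambda_j \Lambda_j^\top + \Psi_j$ with a $p \times q$ loading matrix ($q \ll p$) and diagonal $\Psi_j$. A minimal sketch of the resulting saving (dimensions are illustrative; the count ignores rotational identifiability constraints on $\Lambda$):

```python
# One MFA component covariance: Sigma = Lambda Lambda^T + Psi, with
# p x q loadings (q << p) and a diagonal specific-variance matrix Psi.
import numpy as np

p, q = 100, 3
rng = np.random.default_rng(1)
Lambda = rng.normal(size=(p, q))         # factor loadings
Psi = np.diag(rng.uniform(0.5, 1.5, p))  # diagonal specific variances

Sigma = Lambda @ Lambda.T + Psi          # constrained covariance, still p x p

print(p * (p + 1) // 2)                  # 5050 free parameters, unconstrained
print(p * q + p)                         # 400 free parameters under MFA
```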

  13. Non-Gaussian data. Some solutions (among others) follow two directions: more components than clusters and non-Gaussian component distributions. More components than clusters: merging mixture components (Hennig, 2010; Baudry et al., 2010; Melnykov, 2016); mixtures of mixtures models (Li, 2005) and, in the dimensionally reduced space, mixtures of MFA (Viroli, 2010). Non-Gaussian distributions: mixtures of skew-normal, skew-t and canonical fundamental skew distributions (Lin, 2009; Lee and McLachlan, 2011-2017); mixtures of generalized hyperbolic distributions (Subedi and McNicholas, 2014; Franczak et al., 2014); MFA with non-normal distributions (McLachlan et al., 2007; Andrews and McNicholas, 2011; and many recent proposals by McNicholas, McLachlan and colleagues).

  14. Deep Gaussian Mixture Models

  15. Why Deep Mixtures? A Deep Gaussian Mixture Model (DGMM) is a network of multiple layers of latent variables where, at each layer, the variables follow a mixture of Gaussian distributions.

  16. Gaussian Mixtures vs Deep Gaussian Mixtures. Given data $y$ of dimension $n \times p$, the mixture model

$$f(y; \theta) = \sum_{j=1}^{k_1} \pi_j \, \phi^{(p)}(y; \mu_j, \Sigma_j)$$

can be rewritten as a linear model holding with a certain prior probability:

$$y = \mu_j + \Lambda_j z + u \quad \text{with probability } \pi_j,$$

where $z \sim N(0, I_p)$, $u$ is an independent specific random error with $u \sim N(0, \Psi_j)$, and $\Sigma_j = \Lambda_j \Lambda_j^\top + \Psi_j$.
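This linear-model view is generative: draw a component label with probability $\pi_j$, then build $y$ from $z$ and $u$. A minimal sampler (a sketch, not the talk's code; dimensions and parameter values are illustrative):

```python
# Sample y = mu_j + Lambda_j z + u, with j drawn with probability pi_j,
# z ~ N(0, I_p) and u ~ N(0, Psi_j). All parameter values are illustrative.
import numpy as np

rng = np.random.default_rng(2)
p, k1 = 2, 2
pi = np.array([0.6, 0.4])
mu = rng.normal(size=(k1, p))
Lambda = rng.normal(size=(k1, p, p))
Psi = np.stack([np.eye(p) * 0.1] * k1)

def sample_y():
    j = rng.choice(k1, p=pi)                          # component label
    z = rng.standard_normal(p)                        # z ~ N(0, I_p)
    u = rng.multivariate_normal(np.zeros(p), Psi[j])  # u ~ N(0, Psi_j)
    return mu[j] + Lambda[j] @ z + u

print(sample_y())
```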

  17. Now suppose we replace $z \sim N(0, I_p)$ with

$$f(z; \theta) = \sum_{j=1}^{k_2} \pi_j^{(2)} \, \phi^{(p)}(z; \mu_j^{(2)}, \Sigma_j^{(2)})$$

This defines a Deep Gaussian Mixture Model (DGMM) with $h = 2$ layers.
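Generatively, the $h = 2$ model just replaces the standard-normal draw of $z$ with a draw from the second-layer mixture. A minimal sketch under the same illustrative conventions as above:

```python
# Two-layer DGMM sampler: z comes from a k2-component Gaussian mixture,
# then y = mu_j1 + Lambda_j1 z + u as in the one-layer rewrite.
import numpy as np

rng = np.random.default_rng(3)
p, k1, k2 = 2, 2, 4
pi1, pi2 = np.full(k1, 1 / k1), np.full(k2, 1 / k2)
mu1, Lambda1 = rng.normal(size=(k1, p)), rng.normal(size=(k1, p, p))
Psi1 = np.stack([np.eye(p) * 0.1] * k1)
mu2 = rng.normal(size=(k2, p))
Sigma2 = np.stack([np.eye(p)] * k2)

def sample_y_deep():
    j2 = rng.choice(k2, p=pi2)                        # layer-2 component
    z = rng.multivariate_normal(mu2[j2], Sigma2[j2])  # z from the layer-2 mixture
    j1 = rng.choice(k1, p=pi1)                        # layer-1 component
    u = rng.multivariate_normal(np.zeros(p), Psi1[j1])
    return mu1[j1] + Lambda1[j1] @ z + u

print(sample_y_deep())
```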

  18. Deep Gaussian Mixtures. Imagine $h = 2$, $k_2 = 4$ and $k_1 = 2$: there are $k = 8$ possible paths (total subcomponents), but only $M = 6$ real subcomponents (shared sets of parameters); $M < k$ thanks to the tying. This is a special mixtures of mixtures model (Li, 2005).
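The path and subcomponent counts follow directly from the layer sizes; a one-line check of the example's arithmetic:

```python
# For h = 2 layers: k1 * k2 possible paths, but only k1 + k2 distinct
# component parameter sets thanks to the tying across layers.
k1, k2 = 2, 4
print(k1 * k2, k1 + k2)  # 8 paths, 6 real subcomponents
```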

  19. Do we really need a DGMM? Consider the $k = 4$ clustering problem given by the Smile data. [Figure: scatter plot of the Smile data, both axes from -2 to 2.]
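The exact Smile generator is not given in the slides; a hypothetical stand-in with the same flavor (two Gaussian "eyes", a Gaussian "nose" and an arc-shaped "mouth", so the four true clusters are far from jointly Gaussian) could look like this:

```python
# Hypothetical smile-like data: the arc-shaped "mouth" cluster is the one
# a plain GMM struggles with. All coordinates and scales are invented.
import numpy as np

rng = np.random.default_rng(4)
n = 200
eye_l = rng.normal([-1.0, 1.0], 0.15, size=(n, 2))
eye_r = rng.normal([1.0, 1.0], 0.15, size=(n, 2))
nose = rng.normal([0.0, 0.2], 0.15, size=(n, 2))
theta = rng.uniform(np.pi + 0.5, 2 * np.pi - 0.5, n)  # lower arc
mouth = np.c_[1.5 * np.cos(theta), 1.5 * np.sin(theta) + 0.5]
mouth += rng.normal(0.0, 0.05, size=mouth.shape)

X = np.vstack([eye_l, eye_r, nose, mouth])             # (4n, 2) data matrix
labels = np.repeat([0, 1, 2, 3], n)                    # true cluster labels
```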

  20. A deep mixture with $h = 2$, $k_1 = 4$, $k_2 = 2$ ($k = 8$ paths, $M = 6$). [Figure: boxplots of the Adjusted Rand Index, ranging from about 0.4 to 0.9, for kmeans, pam, hclust, mclust, msn, mst and deepmixt.]
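The Adjusted Rand Index used in the comparison measures agreement between an estimated partition and the true one (1 = perfect recovery, about 0 = random assignment). A minimal, self-contained example of computing it with scikit-learn (assumed available; the data and the k-means baseline are illustrative, not the talk's experiment):

```python
# Compute the Adjusted Rand Index of a k-means partition against truth.
from sklearn.cluster import KMeans
from sklearn.datasets import make_blobs
from sklearn.metrics import adjusted_rand_score

X, labels = make_blobs(n_samples=300, centers=4, random_state=0)
pred = KMeans(n_clusters=4, n_init=10, random_state=0).fit_predict(X)
print(adjusted_rand_score(labels, pred))  # close to 1 on well-separated blobs
```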

  21. With the same deep mixture ($h = 2$, $k_1 = 4$, $k_2 = 2$; $k = 8$ paths, $M = 6$): in the DGMM we cluster the data into $k_1$ groups ($k_1 < k$) through $f(y \mid z)$; the remaining components in the previous layer(s) act as a density approximation of globally non-Gaussian components. This gives an automatic tool for merging mixture components, and the merging is unit-dependent.
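One way to read "unit-dependent merging" (a sketch under assumed notation, not the talk's estimation code): each observation carries a posterior over all $k_1 \times k_2$ paths, and its cluster is the top-layer index with the largest aggregated posterior, so which subcomponents are effectively merged can differ from unit to unit:

```python
# Aggregate path posteriors over the deeper layer to get the k1 cluster
# posteriors; stand-in Dirichlet draws replace real fitted posteriors.
import numpy as np

rng = np.random.default_rng(5)
k1, k2, n = 4, 2, 5
post = rng.dirichlet(np.ones(k1 * k2), size=n)  # posterior over k1*k2 paths
post = post.reshape(n, k1, k2)                  # index paths as (j1, j2)

cluster = post.sum(axis=2).argmax(axis=1)       # sum out layer 2, pick j1
print(cluster)
```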
