Simplifying mixtures of Parzen windows
GRETSI 2011, Bordeaux, France


SLIDE 1

Mixture Models Simplification Software library

Simplifying mixtures of Parzen windows

GRETSI 2011, Bordeaux, France Olivier Schwander Frank Nielsen

École Polytechnique

September 6, 2011

Olivier Schwander Simplifying mixtures of Parzen windows

SLIDE 2

Outline

Mixture Models
◮ Statistical mixtures
◮ Getting mixtures

Simplification
◮ k-means
◮ One-step clustering
◮ Experiments

Software library
◮ Presentation

SLIDE 3

Mixture models

Mixture

◮ Pr(X = x) = Σ_i ω_i Pr(X = x | μ_i, Θ_i)
◮ each Pr(X = x | μ_i, Θ_i) is a probability density function

Famous special case

Gaussian Mixture Models (GMM)
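To make the definition concrete, here is a minimal sketch (not from the slides; names are illustrative) that evaluates the density of a 1D Gaussian mixture:

```python
import numpy as np

def gmm_pdf(x, weights, means, sigmas):
    """Density of a 1D Gaussian mixture: sum_i w_i N(x; mu_i, sigma_i^2)."""
    x = np.asarray(x, dtype=float)
    total = np.zeros_like(x)
    for w, mu, s in zip(weights, means, sigmas):
        # each component is a Gaussian pdf, weighted by w
        total += w * np.exp(-0.5 * ((x - mu) / s) ** 2) / (s * np.sqrt(2 * np.pi))
    return total
```

When the weights sum to 1, the mixture is itself a probability density (it integrates to 1).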

SLIDE 4

Getting mixtures

Expectation-Maximization

Kernel density estimation

◮ also called the Parzen windows method
◮ one kernel per data point (often a Gaussian kernel)
◮ fixed bandwidth

[Figure: kernel density estimate of the data, density values up to about 0.012]
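A Parzen windows estimate is itself a mixture with one equally weighted kernel per sample; a minimal sketch (illustrative names, Gaussian kernel, fixed bandwidth):

```python
import numpy as np

def parzen_mixture(data, bandwidth):
    """Parzen windows estimate written as an explicit Gaussian mixture:
    one component per data point, uniform weights, fixed bandwidth."""
    n = len(data)
    weights = np.full(n, 1.0 / n)
    means = np.asarray(data, dtype=float)
    sigmas = np.full(n, bandwidth)
    return weights, means, sigmas

def kde_pdf(x, data, bandwidth):
    """Evaluate the Parzen windows density at the points x."""
    w, mu, s = parzen_mixture(data, bandwidth)
    x = np.asarray(x, dtype=float)[:, None]  # broadcast over components
    return (w * np.exp(-0.5 * ((x - mu) / s) ** 2) / (s * np.sqrt(2 * np.pi))).sum(axis=1)
```

This makes the size problem visible: n data points produce an n-component mixture.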

SLIDE 5

Why simplification?

A lot of components

◮ 120 × 120 = 14 400 Gaussians in the previous curve

KDE: good approximation, but

◮ very large mixture: time and memory problems
◮ a low number of components is often enough (as EM shows)

EM: small approximation, but we may want a fixed number of components without learning a new mixture:

◮ EM is slow
◮ we may not have the original dataset, just the model

SLIDE 6

k-means

[Figure: KDE mixture and its k-means simplification, both with density values up to about 0.012]

SLIDE 7

k-means

What do we need?

◮ a distance (or a divergence, or a dissimilarity measure)
◮ a centroid
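A Lloyd-type k-means over mixture components can be written with exactly those two ingredients as plug-ins; a sketch under illustrative names, with `divergence` and `centroid` supplied by the caller:

```python
import numpy as np

def kmeans_simplify(components, k, divergence, centroid, iters=20, seed=0):
    """Simplify a list of mixture components down to k, Lloyd-style:
    assign each component to its closest center (under `divergence`),
    then replace each center by the `centroid` of its cluster."""
    rng = np.random.default_rng(seed)
    centers = [components[i] for i in rng.choice(len(components), k, replace=False)]
    for _ in range(iters):
        labels = [min(range(k), key=lambda j: divergence(c, centers[j]))
                  for c in components]
        for j in range(k):
            cluster = [c for c, l in zip(components, labels) if l == j]
            if cluster:  # keep the old center if its cluster emptied
                centers[j] = centroid(cluster)
    return centers
```

Any divergence/centroid pair from the following slides (Kullback-Leibler, Fisher, model centroids) can be dropped in without touching the loop.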

SLIDE 8

Kullback-Leibler divergence

Divergence

◮ D(P‖Q) = ∫ p(x) log (p(x)/q(x)) dx
◮ not a symmetric divergence

Centroids

◮ Left-sided one: min_x Σ_i ω_i B_F(x, p_i)
◮ Right-sided one: min_x Σ_i ω_i B_F(p_i, x)
◮ Various symmetrizations!
◮ Known in closed form
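Between two univariate Gaussians, for example, the divergence is available in closed form (a standard formula, stated here for illustration):

```python
import numpy as np

def kl_gaussian(mu1, s1, mu2, s2):
    """Closed-form KL(N(mu1, s1^2) || N(mu2, s2^2)) for 1D Gaussians."""
    return np.log(s2 / s1) + (s1 ** 2 + (mu1 - mu2) ** 2) / (2 * s2 ** 2) - 0.5
```

Swapping the arguments generally changes the value, which is why the left-sided and right-sided centroids above are distinct objects.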

SLIDE 9

Fisher divergence

Riemannian metric on the statistical manifold

Fisher information matrix

g_ij = I(θ_i, θ_j) = E[ (∂/∂θ_i) log p(X; θ) · (∂/∂θ_j) log p(X; θ) ]

ds² = Σ_ij g_ij dθ_i dθ_j
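As a concrete instance (a standard result, stated here for illustration): for a univariate Gaussian parametrized by θ = (μ, σ), the Fisher information matrix and line element are

```latex
g(\mu, \sigma) =
\begin{pmatrix}
  1/\sigma^2 & 0 \\
  0          & 2/\sigma^2
\end{pmatrix},
\qquad
ds^2 = \frac{d\mu^2 + 2\, d\sigma^2}{\sigma^2}.
```

Substituting u = μ/√2 gives ds² = 2 (du² + dσ²)/σ², the Poincaré upper half-plane metric up to a factor 2, which is why the distance formula on the next slide maps (μ, σ) to (μ/√2, σ) and carries a √2 factor.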

SLIDE 10

Fisher divergence formula

Known for zero-mean Gaussians

◮ not really interesting for mixtures. . .
◮ open problem for other cases

For 1D data

◮ Poincaré hyperbolic distance in the Poincaré upper half-plane

FRD(f_p, f_q) = √2 ln [ ( ‖(μ_p/√2, σ_p) − (μ_q/√2, −σ_q)‖ + ‖(μ_p/√2, σ_p) − (μ_q/√2, σ_q)‖ ) / ( ‖(μ_p/√2, σ_p) − (μ_q/√2, −σ_q)‖ − ‖(μ_p/√2, σ_p) − (μ_q/√2, σ_q)‖ ) ]
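This is the hyperbolic distance of the Poincaré upper half-plane applied to the points (μ/√2, σ), scaled by √2; a sketch with illustrative names:

```python
import numpy as np

def fisher_rao_gaussian(mu_p, s_p, mu_q, s_q):
    """Fisher-Rao distance between two 1D Gaussians, via the hyperbolic
    half-plane distance after the map (mu, sigma) -> (mu/sqrt(2), sigma)."""
    a = np.array([mu_p / np.sqrt(2), s_p])
    b = np.array([mu_q / np.sqrt(2), s_q])
    b_conj = np.array([mu_q / np.sqrt(2), -s_q])  # mirror through the real axis
    num = np.linalg.norm(a - b_conj) + np.linalg.norm(a - b)
    den = np.linalg.norm(a - b_conj) - np.linalg.norm(a - b)
    return np.sqrt(2) * np.log(num / den)
```

Unlike the Kullback-Leibler divergence, this distance is symmetric in its two arguments.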

SLIDE 11

Fisher centroids

No closed-form formula

◮ even for 1D Gaussians
◮ brute-force search for the minimizer? not very elegant

SLIDE 12

Model centroids

Centroid in constant curvature spaces

◮ from Poincaré upper half-plane to Poincaré disk
◮ from Poincaré disk to Klein disk
◮ from Klein disk to Minkowski model
◮ center of mass and renormalization
◮ from Minkowski model back to the Poincaré upper half-plane

[Figure: weighted center of mass ω1 p′1 + ω2 p′2 of points p1, p2 computed in the Minkowski model and projected back to the Klein disk]

• Galperin. A concept of the mass center of a system of material points in the constant curvature spaces. 1993.
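The round trip above can be sketched as follows (a minimal illustration of the Galperin-style construction, with made-up function names; half-plane points are complex numbers x + iσ, which correspond to 1D Gaussians via x = μ/√2):

```python
import numpy as np

def halfplane_to_klein(z):
    """Poincare half-plane -> Poincare disk (Cayley transform) -> Klein disk."""
    u = (z - 1j) / (z + 1j)
    k = 2 * u / (1 + abs(u) ** 2)
    return np.array([k.real, k.imag])

def klein_to_halfplane(k):
    """Klein disk -> Poincare disk -> half-plane (inverse Cayley transform)."""
    kk = complex(k[0], k[1])
    u = kk / (1 + np.sqrt(1 - abs(kk) ** 2))
    return (1j * (1 + u)) / (1 - u)

def hyperbolic_centroid(points, weights):
    """Galperin-style center of mass: lift Klein-disk points to the
    Minkowski hyperboloid, average, renormalize, and map back."""
    total = np.zeros(3)
    for z, w in zip(points, weights):
        k = halfplane_to_klein(z)
        x = np.array([1.0, k[0], k[1]]) / np.sqrt(1 - k @ k)  # hyperboloid lift
        total += w * x
    # renormalize with the Minkowski norm to land back on the hyperboloid
    c = total / np.sqrt(total[0] ** 2 - total[1] ** 2 - total[2] ** 2)
    k = c[1:] / c[0]  # central projection back to the Klein disk
    return klein_to_halfplane(k)
```

For two equally weighted points this construction returns their geodesic midpoint, and every step is a closed-form map, avoiding the brute-force minimization of the previous slide.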

Olivier Schwander Simplifying mixtures of Parzen windows

slide-13
SLIDE 13

Mixture Models Simplification Software library k-means One-step clustering Experiments

One-step clustering

What are we looking for?

◮ the best model? Failure: we only reach a local minimum...
◮ a good enough model? under which constraints?

What happens if we skip the iterations of the k-means?

◮ Faster!
◮ Quality?
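Dropping the Lloyd iterations leaves a single assignment and a single centroid computation; a sketch (illustrative names, same plug-in distance and centroid as before):

```python
import numpy as np

def one_step_simplify(components, k, divergence, centroid, seed=0):
    """One-step clustering: pick k initial centers, do a single assignment
    and a single centroid update -- no k-means iterations."""
    rng = np.random.default_rng(seed)
    centers = [components[i] for i in rng.choice(len(components), k, replace=False)]
    labels = [min(range(k), key=lambda j: divergence(c, centers[j]))
              for c in components]
    # one centroid computation per cluster; keep the seed if a cluster is empty
    return [centroid([c for c, l in zip(components, labels) if l == j] or [centers[j]])
            for j in range(k)]
```

The result depends on the initial centers, so it targets a "good enough" model rather than the best one.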

SLIDE 14

Experiments: log-likelihood

◮ EM and k-means with KL are very good, whatever the number of components
◮ k-means with model centroids and one-step k-means with model centroids just need a few more components

SLIDE 15

Experiments: time

◮ KL even slower than EM (a closed-form formula does not mean cheap computation)
◮ one-step clustering is really fast, with good quality

SLIDE 16

Bioinformatics application: prediction of RNA 3D structure

Previous work

◮ Dirichlet process mixtures
◮ high quality models but too slow

[Figure: original data; KDE and simplified KDE; EM and simplified EM]

Joint work with A. Sim, M. Levitt and J. Bernauer (INRIA and Stanford)

SLIDE 17

pyMEF: a Python library for Exponential families

Manipulation of mixtures of EF

◮ direct creation of mixtures
◮ learning of mixtures: Bregman soft clustering
◮ simplification of mixtures: Bregman hard clustering, model hard clustering
◮ visualization

Goals

◮ generic framework for EF (and Information Geometry) ◮ rapid prototyping (Python shell)

SLIDE 18

Conclusion

A better way to get mixtures

◮ compact mixtures
◮ fast to learn
◮ fast to use

One-step clustering

◮ Would need to be validated by a real application

pyMEF

◮ a library for all of this
◮ release coming soon (hopefully)
◮ http://www.lix.polytechnique.fr/~schwander/pyMEF
