Variational Inference for Dirichlet Process Mixtures


SLIDE 1

Variational Inference for Dirichlet Process Mixtures

By David Blei and Michael Jordan Presented by Daniel Acuna

SLIDE 2

Motivation

• Non-parametric Bayesian models seem to be the right idea: do not fix the number of mixture components.

• The Dirichlet process is an elegant and principled way to set the number of components "automatically".

• We need to explore new methods that cope with the intractable nature of marginalization and conditioning.

• MCMC sampling methods are widely used in this context, but there are other ideas.

SLIDE 3

Motivation

• Variational inference has proved to be faster and more predictable (deterministic) than sampling.

• The basic idea:
  1. Reformulate inference as an optimization problem.
  2. Relax the optimization problem.
  3. Optimize (find a bound on the original problem).

SLIDE 4

Background

• The Dirichlet process mixture is built on the Dirichlet process, a measure on measures: $G \sim \mathrm{DP}(\alpha, G_0)$.

• It admits multiple representations and interpretations:
  - Ferguson's existence theorem
  - Blackwell-MacQueen urn scheme
  - Chinese restaurant process
  - Stick-breaking construction

SLIDE 5

Dirichlet process mixture model

• Base distribution $G_0$ and positive scaling parameter $\alpha$.

• The draws $\eta_1, \ldots, \eta_{n-1}$ exhibit a clustering effect.

• The DP mixture has a natural interpretation as a flexible mixture model in which the number of components is random and grows as new data are observed.
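A minimal sketch of the model these bullets describe, in standard DP mixture notation, where $\eta_n$ is the parameter of the n-th observation:

$$G \mid \{\alpha, G_0\} \sim \mathrm{DP}(\alpha, G_0)$$
$$\eta_n \mid G \sim G$$
$$X_n \mid \eta_n \sim p(x_n \mid \eta_n)$$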

SLIDE 6

Stick-breaking representation

• Two infinite collections of independent random variables, for $i = \{1, 2, \ldots\}$:

$$V_i \sim \mathrm{Beta}(1, \alpha), \qquad \eta_i^* \sim G_0$$

• Stick-breaking representation of G:

$$\pi_i(v) = v_i \prod_{j=1}^{i-1} (1 - v_j)$$

$$G = \sum_{i=1}^{\infty} \pi_i(v)\, \delta_{\eta_i^*}$$

• G is discrete!
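A quick way to see the construction is to sample a truncated version of it. The sketch below assumes a truncation level T and a standard normal base measure $G_0$; both are illustrative choices, not fixed by the slides:

```python
# Truncated stick-breaking draw of G ~ DP(alpha, G0).
import numpy as np

def stick_breaking(alpha, T=100, rng=None):
    """Return weights pi_i(v) and atoms eta*_i of a truncated DP draw."""
    rng = np.random.default_rng(rng)
    v = rng.beta(1.0, alpha, size=T)           # V_i ~ Beta(1, alpha)
    atoms = rng.standard_normal(T)             # eta*_i ~ G0 (here N(0, 1))
    # pi_i(v) = v_i * prod_{j<i} (1 - v_j): break off a fraction v_i of
    # whatever stick length remains after the first i-1 breaks.
    remaining = np.concatenate(([1.0], np.cumprod(1.0 - v)[:-1]))
    weights = v * remaining
    return weights, atoms

weights, atoms = stick_breaking(alpha=2.0, rng=0)
print(weights[:5], weights.sum())  # weights nearly sum to 1 for large T
```

Smaller $\alpha$ concentrates the mass on a few atoms, which is why draws of G are discrete with probability one.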

SLIDE 7

Stick-breaking rep.

The data can be described as arising from the following process:

1) Draw $V_i \mid \alpha \sim \mathrm{Beta}(1, \alpha)$, for $i = \{1, 2, \ldots\}$
2) Draw $\eta_i^* \mid G_0 \sim G_0$, for $i = \{1, 2, \ldots\}$
3) For the n-th data point:
   1) Draw $Z_n \mid \{v_1, v_2, \ldots\} \sim \mathrm{Mult}(\pi(v))$
   2) Draw $X_n \mid z_n \sim p(x_n \mid \eta_{z_n}^*)$
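A sketch of this generative process, reusing stick_breaking() from the previous snippet; the Gaussian emission $p(x \mid \eta^*) = \mathcal{N}(\eta^*, 1)$ is an illustrative assumption:

```python
# Generate data from a (truncated) DP mixture via the process above.
import numpy as np

def sample_dp_mixture(n, alpha, T=100, rng=None):
    rng = np.random.default_rng(rng)
    weights, atoms = stick_breaking(alpha, T=T, rng=rng)
    weights = weights / weights.sum()        # renormalize truncated weights
    z = rng.choice(T, size=n, p=weights)     # Z_n ~ Mult(pi(v))
    x = rng.normal(loc=atoms[z], scale=1.0)  # X_n ~ p(x_n | eta*_{z_n})
    return x, z

x, z = sample_dp_mixture(n=500, alpha=2.0, rng=1)
print(len(np.unique(z)), "components used")  # grows (slowly) with n and alpha
```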

SLIDE 8

DP mixture for exponential families

• The observable data are drawn from an exponential family, and the base distribution $G_0$ is the corresponding conjugate prior.
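As a sketch of what conjugacy buys here, in the standard exponential family form with $\lambda = (\lambda_1, \lambda_2)$ the natural parameter of $G_0$:

$$p(x \mid \eta) = h(x) \exp\{\eta^T x - a(\eta)\}$$
$$p(\eta \mid \lambda) \propto \exp\{\lambda_1^T \eta - \lambda_2\, a(\eta)\}$$

Conjugacy means the posterior over $\eta$ stays in the same family, with $\lambda_1$ incremented by the sufficient statistics and $\lambda_2$ by the number of observations.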

SLIDE 9

Variational inf. for DP mix.

• In the DP mixture, our goal is to compute the posterior over the latent variables.

• But this posterior is complex (intractable to compute exactly).

• Variational inference uses a proposal distribution that breaks the dependency among the latent variables.

SLIDE 10

Variational inf. for DP mix.

• In general, consider a model with hyperparameters $\theta$, latent variables $w = \{w_1, \ldots, w_M\}$, and observations $x = \{x_1, \ldots, x_N\}$.

• The posterior distribution:

$$p(w \mid x, \theta) = \frac{p(x, w \mid \theta)}{\int p(x, w \mid \theta)\, dw}$$

Difficult!

SLIDE 11

Variational inf. for DP mix

• This is difficult because the latent variables become dependent when conditioning on the observed data.

• We reformulate the problem using the mean-field method, which optimizes the KL divergence with respect to a variational distribution.

SLIDE 12

Variational inf. for DP mix

• That is, we aim to minimize the KL divergence between the variational distribution $q_\nu(w)$ and the posterior $p(w \mid x, \theta)$.

• Or equivalently, we try to maximize the lower bound on the log marginal likelihood $\log p(x \mid \theta)$.
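The two views are connected by the standard decomposition of the log marginal likelihood, stated here in the notation above:

$$\log p(x \mid \theta) = \mathrm{KL}\big(q_\nu(w) \,\|\, p(w \mid x, \theta)\big) + \big( E_q[\log p(x, W \mid \theta)] - E_q[\log q_\nu(W)] \big)$$

Since the left-hand side does not depend on $q_\nu$, minimizing the KL term and maximizing the second term (the lower bound) are the same problem.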

SLIDE 13

Mean field of exponential fam.

• For each latent variable, the conditional is a member of an exponential family:

$$p(w_i \mid w_{-i}, x, \theta) = h(w_i) \exp\{ g_i(w_{-i}, x, \theta)^T w_i - a(g_i(w_{-i}, x, \theta)) \}$$

• where $g_i(w_{-i}, x, \theta)$ is the natural parameter of $w_i$ when conditioned on the remaining latent variables.

• Here the family of variational distributions is the fully factorized

$$q_\nu(w) = \prod_{i=1}^{M} \exp\{ \nu_i^T w_i - a(\nu_i) \}\, h(w_i)$$

with variational parameters $\nu = \{\nu_1, \ldots, \nu_M\}$.

SLIDE 14

Mean-field of exponential family

• Optimizing the KL divergence yields, after derivation (see Appendix), the coordinate update

$$\nu_i = E_q[\, g_i(W_{-i}, x, \theta)\, ]$$

• Notice:
  - In Gibbs sampling, we draw $w_i$ from $p(w_i \mid w_{-i}, x, \theta)$.
  - Here, we update $\nu_i$ to set it equal to $E_q[\, g_i(W_{-i}, x, \theta)\, ]$.

SLIDE 15

DP mixtures

• The latent variables are the stick lengths $V$, the atoms $\eta^*$, and the cluster assignments $Z$.

• The hyperparameters are the scaling parameter $\alpha$ and the parameter $\lambda$ of the conjugate base distribution.

• And the bound is now the one sketched below.
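A sketch of that bound, assembled from the stick-breaking model above: one expected log-probability term per factor, plus the entropy $H(q)$ of the variational distribution:

$$\log p(x \mid \alpha, \lambda) \ge E_q[\log p(V \mid \alpha)] + E_q[\log p(\eta^* \mid \lambda)] + \sum_{n=1}^{N} \big( E_q[\log p(Z_n \mid V)] + E_q[\log p(x_n \mid Z_n)] \big) + H(q)$$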

SLIDE 16

Relaxation of optimization

• To exploit this bound with a family q, we need to approximate G.

• G is an infinite-dimensional random measure.

• One approximation is to truncate the stick-breaking representation!

SLIDE 17

Relaxation of optimization

• Fix a truncation level T and set $q(v_T = 1) = 1$; then the proportions $\pi_t(v)$ are equal to zero for $t > T$
  (remember $\pi_i(v) = v_i \prod_{j=1}^{i-1}(1 - v_j)$ from the stick-breaking representation).

• Propose the factorized family

$$q(v, \eta^*, z) = \prod_{t=1}^{T-1} q_{\gamma_t}(v_t) \prod_{t=1}^{T} q_{\tau_t}(\eta_t^*) \prod_{n=1}^{N} q_{\phi_n}(z_n)$$

where the $q_{\gamma_t}(v_t)$ are Beta distributions, the $q_{\tau_t}(\eta_t^*)$ are exponential family distributions, and the $q_{\phi_n}(z_n)$ are multinomial distributions.

SLIDE 18

Optimization

• The optimization is performed by a coordinate ascent algorithm.

• From the bound, the term $E_q[\log p(Z_n \mid V)]$ involves the full stick-breaking representation of G: infinite!

SLIDE 19

Optimization

• But

$$p(z_n \mid v) = \prod_{i=1}^{\infty} (1 - v_i)^{\mathbf{1}[z_n > i]}\, v_i^{\mathbf{1}[z_n = i]}$$

• Then, under the truncated q,

$$E_q[\log p(z_n \mid V)] = \sum_{i=1}^{T} \big( q(z_n > i)\, E_q[\log(1 - V_i)] + q(z_n = i)\, E_q[\log V_i] \big)$$

• Where

$$q(z_n = i) = \phi_{n,i}, \qquad q(z_n > i) = \sum_{j=i+1}^{T} \phi_{n,j}$$
$$E_q[\log V_i] = \Psi(\gamma_{i,1}) - \Psi(\gamma_{i,1} + \gamma_{i,2}), \qquad E_q[\log(1 - V_i)] = \Psi(\gamma_{i,2}) - \Psi(\gamma_{i,1} + \gamma_{i,2})$$

with $\Psi$ the digamma function, so every term is finite and computable.

SLIDE 20

Optimization

• Finally, the mean-field coordinate ascent algorithm boils down to cycling through the updates

$$\gamma_{t,1} = 1 + \sum_n \phi_{n,t}, \qquad \gamma_{t,2} = \alpha + \sum_n \sum_{j=t+1}^{T} \phi_{n,j}$$
$$\tau_{t,1} = \lambda_1 + \sum_n \phi_{n,t}\, x_n, \qquad \tau_{t,2} = \lambda_2 + \sum_n \phi_{n,t}$$
$$\phi_{n,t} \propto \exp\Big( E_q[\log V_t] + \sum_{j=1}^{t-1} E_q[\log(1 - V_j)] + E_q[\eta_t^*]^T x_n - E_q[a(\eta_t^*)] \Big)$$

until the bound converges. A sketch in code follows.
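A minimal sketch of these updates for a DP mixture of univariate Gaussians with unit variance and a $\mathcal{N}(0, 1)$ base measure; the model choice, truncation level, and all variable names are illustrative assumptions, not taken from the slides:

```python
# Coordinate ascent variational inference for a truncated DP mixture
# of unit-variance Gaussians with base measure G0 = N(0, 1).
import numpy as np
from scipy.special import digamma

def dp_mixture_cavi(x, alpha=1.0, T=20, n_iter=100, rng=None):
    rng = np.random.default_rng(rng)
    phi = rng.dirichlet(np.ones(T), size=len(x))  # q(z_n): responsibilities
    for _ in range(n_iter):
        counts = phi.sum(axis=0)                  # sum_n phi_{n,t}
        # q(v_t) = Beta(gamma_{t,1}, gamma_{t,2}) for t < T (v_T fixed at 1)
        gamma1 = 1.0 + counts[:-1]
        gamma2 = alpha + np.cumsum(counts[::-1])[::-1][1:]  # sum_{j>t} counts_j
        # q(mu_t) = N(m_t, s2_t): conjugate Gaussian update
        s2 = 1.0 / (1.0 + counts)
        m = s2 * (phi.T @ x)
        # E_q[log pi_t(V)] from the digamma identities on the Beta parameters
        e_log_v = digamma(gamma1) - digamma(gamma1 + gamma2)
        e_log_1mv = digamma(gamma2) - digamma(gamma1 + gamma2)
        e_log_pi = np.append(e_log_v, 0.0)
        e_log_pi += np.concatenate(([0.0], np.cumsum(e_log_1mv)))
        # q(z_n) update: expected log weight + expected Gaussian log-likelihood
        logit = e_log_pi + np.outer(x, m) - 0.5 * (m**2 + s2)
        phi = np.exp(logit - logit.max(axis=1, keepdims=True))
        phi /= phi.sum(axis=1, keepdims=True)
    return phi, m

# Usage: two well-separated Gaussians; only two components keep real mass.
rng = np.random.default_rng(0)
x = np.concatenate((rng.normal(-3, 1, 200), rng.normal(3, 1, 200)))
phi, means = dp_mixture_cavi(x)
print(np.round(means[phi.sum(axis=0) > 10], 2))  # roughly [-3, 3]
```

Each pass is one sweep of the coordinate ascent: the stick, atom, and assignment factors are updated in turn, and each update is deterministic, which is what makes the method more predictable than sampling.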

SLIDE 21

Predictive distribution
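Under the truncated variational approximation, the predictive density for a new point can be sketched as a q-weighted mixture over the T components (expectations taken under the fitted q):

$$p(x_{N+1} \mid x, \alpha, \lambda) \approx \sum_{t=1}^{T} E_q[\pi_t(V)]\; E_q[p(x_{N+1} \mid \eta_t^*)]$$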

SLIDE 22

Empirical comparison

SLIDE 23

Conclusion

• Faster than sampling for particular problems.

• It is unlikely that one method will dominate the other; both have their pros and cons.

• This is the simplest variational method (mean-field). Other methods are worth exploring.

• Check www.videolectures.net