
Approximate Inference - Henrik I. Christensen, Robotics & Intelligent Machines @ GT (PowerPoint presentation transcript)



1. Approximate Inference
Henrik I. Christensen
Robotics & Intelligent Machines @ GT
Georgia Institute of Technology, Atlanta, GA 30332-0280
hic@cc.gatech.edu

2. Outline
1. Introduction
2. Variational Inference
3. Variational Mixture of Gaussians
4. Exponential Family
5. Expectation Propagation
6. Summary

3. Introduction
We are often required to estimate a (conditional) posterior of the form p(Z|X). The solution might be intractable:
1. There might not be a closed-form solution
2. The integration over X or a parameter space θ might be computationally challenging
3. The set of possible outcomes might be significant/exponential
Two strategies:
1. Deterministic approximation methods
2. Stochastic sampling (Monte Carlo techniques)
Today we will talk about deterministic techniques.

4. Outline
1. Introduction
2. Variational Inference
3. Variational Mixture of Gaussians
4. Exponential Family
5. Expectation Propagation
6. Summary

5. Variational Inference
In general we have a Bayesian model as seen earlier, i.e.
$$\ln p(X) = \ln p(X, Z) - \ln p(Z|X)$$
We can rewrite this as
$$\ln p(X) = \mathcal{L}(q) + \mathrm{KL}(q\|p)$$
where
$$\mathcal{L}(q) = \int q(Z) \ln\left\{\frac{p(X, Z)}{q(Z)}\right\} dZ, \qquad \mathrm{KL}(q\|p) = -\int q(Z) \ln\left\{\frac{p(Z|X)}{q(Z)}\right\} dZ$$
So $\mathcal{L}(q)$ is a lower bound on $\ln p(X)$ built from the joint distribution, and $\mathrm{KL}(q\|p)$ is the Kullback-Leibler divergence of $q(Z)$ from $p(Z|X)$.
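To make the decomposition concrete, here is a minimal numeric check (a sketch, not from the slides; the joint table below is an arbitrary illustrative choice) that $\ln p(X) = \mathcal{L}(q) + \mathrm{KL}(q\|p)$ holds for any valid $q(Z)$:

```python
# Minimal numeric check of ln p(X) = L(q) + KL(q||p) on a toy discrete model.
# The joint p(X, Z) below (X fixed, Z in {0,1,2,3}) is made up for illustration.
import numpy as np

p_joint = np.array([0.10, 0.05, 0.20, 0.15])  # p(X, Z) for one fixed X
p_x = p_joint.sum()                           # p(X) = sum_Z p(X, Z)
p_post = p_joint / p_x                        # p(Z | X)

q = np.array([0.25, 0.25, 0.25, 0.25])        # any valid q(Z)

L_q = np.sum(q * np.log(p_joint / q))         # lower bound L(q)
KL = -np.sum(q * np.log(p_post / q))          # KL(q || p), always >= 0

print(np.log(p_x), L_q + KL)                  # identical, for any choice of q
```

Because $\mathrm{KL} \geq 0$, maximizing $\mathcal{L}(q)$ over $q$ is equivalent to driving $q(Z)$ toward $p(Z|X)$.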

6. Factorized Distributions
Assume for now that we can factorize Z into disjoint groups so that
$$q(Z) = \prod_{i=1}^{M} q_i(Z_i)$$
In physics a similar model has been adopted, termed mean field theory. We can then optimize $\mathcal{L}(q)$ through a component-wise optimization:
$$\mathcal{L}(q) = \int q_j \left\{ \int \ln p(X, Z) \prod_{i \neq j} q_i \, dZ_i \right\} dZ_j - \int q_j \ln q_j \, dZ_j + \text{const} = \int q_j \ln \tilde{p}(X, Z_j)\, dZ_j - \int q_j \ln q_j \, dZ_j + \text{const}$$
where
$$\ln \tilde{p}(X, Z_j) = \mathbb{E}_{i \neq j}[\ln p(X, Z)] + \text{const} = \int \ln p(X, Z) \prod_{i \neq j} q_i \, dZ_i + \text{const}$$

7. Factorized Distributions
The optimal solution is now
$$\ln q_j^*(Z_j) = \mathbb{E}_{i \neq j}[\ln p(X, Z)] + \text{const}$$
I.e., the solution where each factor in turn maximizes $\mathcal{L}(q)$ with the other factors held fixed.
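A small sketch of this coordinate update (the $2 \times 2$ joint below is made up; nothing here is from the slides): each factor is refreshed via $\ln q_j^* = \mathbb{E}_{i \neq j}[\ln p] + \text{const}$ while the other is held fixed.

```python
# Coordinate-ascent mean-field sketch for a made-up 2x2 joint over (Z1, Z2),
# approximating it by the product q1(Z1) q2(Z2).
import numpy as np

p = np.array([[0.30, 0.20],
              [0.10, 0.40]])    # target joint, rows: Z1, cols: Z2
log_p = np.log(p)

q1 = np.array([0.5, 0.5])
q2 = np.array([0.5, 0.5])

for _ in range(50):
    # ln q1*(z1) = sum_{z2} q2(z2) ln p(z1, z2) + const
    q1 = np.exp(log_p @ q2)
    q1 /= q1.sum()
    # ln q2*(z2) = sum_{z1} q1(z1) ln p(z1, z2) + const
    q2 = np.exp(log_p.T @ q1)
    q2 /= q2.sum()

print(q1, q2)  # a factorized approximation; correlation in p is necessarily lost
```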

8. Outline
1. Introduction
2. Variational Inference
3. Variational Mixture of Gaussians
4. Exponential Family
5. Expectation Propagation
6. Summary

9. Variational Mixture of Gaussians
We encounter mixtures of Gaussians all the time. Examples are multi-wall modelling, ambiguous localization, ...
We have: a set of observed data X, and a set of latent variables Z that describe the mixture.

10. Mixture of Gaussians - Modelling
We can model the mixture model
$$p(Z|\pi) = \prod_{n=1}^{N} \prod_{k=1}^{K} \pi_k^{z_{nk}}$$
We can also derive the observed conditional
$$p(X|Z, \mu, \Lambda) = \prod_{n=1}^{N} \prod_{k=1}^{K} \mathcal{N}(x_n | \mu_k, \Lambda_k^{-1})^{z_{nk}}$$
We will for now assume that the mixing coefficients are modelled as a Dirichlet distribution
$$p(\pi) = \mathrm{Dir}(\pi | \alpha_0) = C(\alpha_0) \prod_{k=1}^{K} \pi_k^{\alpha_0 - 1}$$
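As a sanity check of the model, a hypothetical generative sketch (the choice of $K = 3$, the means, and the identity precisions are all illustrative, not from the slides): draw $\pi \sim \mathrm{Dir}(\alpha_0)$, $z_n \sim \mathrm{Cat}(\pi)$, $x_n \sim \mathcal{N}(\mu_{z_n}, \Lambda_{z_n}^{-1})$.

```python
# Generative sketch of the mixture model: pi ~ Dir(alpha0 * 1), z_n ~ Cat(pi),
# x_n | z_n = k ~ N(mu_k, Lambda_k^{-1}). All numeric choices are illustrative.
import numpy as np

rng = np.random.default_rng(0)
K, N, D, alpha0 = 3, 500, 2, 1.0

pi = rng.dirichlet(alpha0 * np.ones(K))            # mixing weights
mus = rng.normal(0.0, 5.0, size=(K, D))            # component means
Lambdas = np.stack([np.eye(D)] * K)                # identity precisions, for simplicity

z = rng.choice(K, size=N, p=pi)                    # latent assignments z_n
X = np.stack([rng.multivariate_normal(mus[k], np.linalg.inv(Lambdas[k]))
              for k in z])                         # observations x_n, shape (N, D)
```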

11. Mixture of Gaussians - Modelling
The component parameters can be modelled as a Gaussian-Wishart
$$p(\mu, \Lambda) = p(\mu|\Lambda)\, p(\Lambda) = \prod_{k=1}^{K} \mathcal{N}(\mu_k | m_0, (\beta_0 \Lambda_k)^{-1})\, \mathcal{W}(\Lambda_k | W_0, \nu_0)$$
I.e., a total model in which π generates each z_n, which together with µ and Λ generates x_n.
[Graphical model: π → z_n → x_n, with µ and Λ also parents of x_n; plate over the N observations.]

12. Mixtures of Gaussians - Variational
The conditional model can be seen as
$$p(X, Z, \pi, \mu, \Lambda) = p(X|Z, \mu, \Lambda)\, p(Z|\pi)\, p(\pi)\, p(\mu|\Lambda)\, p(\Lambda)$$
Only X is observed. We can now consider the selection of a distribution
$$q(Z, \pi, \mu, \Lambda) = q(Z)\, q(\pi, \mu, \Lambda)$$
This is clearly an assumption of independence. We can use the general result of component-wise optimization:
$$\ln q^*(Z) = \mathbb{E}_{\pi, \mu, \Lambda}[\ln p(X, Z, \pi, \mu, \Lambda)] + \text{const}$$
Decomposition gives us
$$\ln q^*(Z) = \mathbb{E}_{\pi}[\ln p(Z|\pi)] + \mathbb{E}_{\mu, \Lambda}[\ln p(X|Z, \mu, \Lambda)] + \text{const} = \sum_{n=1}^{N} \sum_{k=1}^{K} z_{nk} \ln \rho_{nk} + \text{const}$$

13. Mixtures of Gaussians - Variational
We can further derive
$$\ln \rho_{nk} = \mathbb{E}[\ln \pi_k] + \frac{1}{2}\mathbb{E}[\ln |\Lambda_k|] - \frac{D}{2}\ln 2\pi - \frac{1}{2}\mathbb{E}_{\mu_k, \Lambda_k}\!\left[(x_n - \mu_k)^T \Lambda_k (x_n - \mu_k)\right] + \text{const}$$
Taking the exponential we have
$$q^*(Z) \propto \prod_{n=1}^{N} \prod_{k=1}^{K} \rho_{nk}^{z_{nk}}$$
Using normalization we arrive at
$$q^*(Z) = \prod_{n=1}^{N} \prod_{k=1}^{K} r_{nk}^{z_{nk}}$$
where
$$r_{nk} = \frac{\rho_{nk}}{\sum_j \rho_{nj}}$$
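In practice this normalization is best done in log space, since the $\rho_{nk}$ underflow easily; a small sketch (the function name is my own):

```python
# Turn ln rho_nk into r_nk = rho_nk / sum_j rho_nj without underflow.
import numpy as np
from scipy.special import logsumexp

def responsibilities(log_rho):
    """log_rho: (N, K) array of ln rho_nk; returns (N, K) r_nk, rows sum to 1."""
    return np.exp(log_rho - logsumexp(log_rho, axis=1, keepdims=True))
```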

14. Mixtures of Gaussians - Variational
Just as we saw for EM we can define
$$N_k = \sum_{n=1}^{N} r_{nk}$$
$$\bar{x}_k = \frac{1}{N_k} \sum_{n=1}^{N} r_{nk} x_n$$
$$S_k = \frac{1}{N_k} \sum_{n=1}^{N} r_{nk} (x_n - \bar{x}_k)(x_n - \bar{x}_k)^T$$
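A direct sketch of these statistics (the function name is my own; it assumes every component receives some responsibility, so $N_k > 0$):

```python
# Sufficient statistics N_k, xbar_k, S_k from data X and responsibilities r.
import numpy as np

def mixture_statistics(X, r):
    """X: (N, D) data, r: (N, K) responsibilities."""
    Nk = r.sum(axis=0)                            # N_k
    xbar = (r.T @ X) / Nk[:, None]                # xbar_k, shape (K, D)
    K, D = r.shape[1], X.shape[1]
    S = np.empty((K, D, D))
    for k in range(K):
        d = X - xbar[k]                           # data centered on component k
        S[k] = (r[:, k, None] * d).T @ d / Nk[k]  # S_k, responsibility-weighted scatter
    return Nk, xbar, S
```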

15. Mixtures of Gaussians - Parameters/Mixture
Let's now consider q(π, µ, Λ) to arrive at
$$\ln q^*(\pi, \mu, \Lambda) = \ln p(\pi) + \sum_{k=1}^{K} \ln p(\mu_k, \Lambda_k) + \mathbb{E}_Z[\ln p(Z|\pi)] + \sum_{k=1}^{K} \sum_{n=1}^{N} \mathbb{E}[z_{nk}] \ln \mathcal{N}(x_n | \mu_k, \Lambda_k^{-1}) + \text{const}$$
We can partition the problem into
$$q(\pi, \mu, \Lambda) = q(\pi) \prod_{k=1}^{K} q(\mu_k, \Lambda_k)$$
We can derive
$$\ln q^*(\pi) = (\alpha_0 - 1) \sum_{k=1}^{K} \ln \pi_k + \sum_{k=1}^{K} \sum_{n=1}^{N} r_{nk} \ln \pi_k + \text{const}$$
We can now derive $q^*(\pi) = \mathrm{Dir}(\pi | \alpha)$ where $\alpha_k = \alpha_0 + N_k$.

16. Mixtures of Gaussians - Parameters/Mixture
We can then derive
$$q^*(\mu_k, \Lambda_k) = \mathcal{N}(\mu_k | m_k, (\beta_k \Lambda_k)^{-1})\, \mathcal{W}(\Lambda_k | W_k, \nu_k)$$
where
$$\beta_k = \beta_0 + N_k$$
$$m_k = \frac{1}{\beta_k}(\beta_0 m_0 + N_k \bar{x}_k)$$
$$W_k^{-1} = W_0^{-1} + N_k S_k + \frac{\beta_0 N_k}{\beta_0 + N_k}(\bar{x}_k - m_0)(\bar{x}_k - m_0)^T$$
$$\nu_k = \nu_0 + N_k + 1$$
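These updates translate directly into code; a sketch under the slide's conventions (the function name is my own, and it consumes the $N_k, \bar{x}_k, S_k$ statistics from two slides back):

```python
# Hyperparameter updates for q*(mu_k, Lambda_k), following the slide's formulas.
import numpy as np

def update_gauss_wishart(Nk, xbar, S, m0, beta0, W0inv, nu0):
    beta = beta0 + Nk                                          # beta_k
    m = (beta0 * m0 + Nk[:, None] * xbar) / beta[:, None]      # m_k
    K, D = xbar.shape
    Winv = np.empty((K, D, D))
    for k in range(K):
        d = (xbar[k] - m0)[:, None]                            # (D, 1) column
        Winv[k] = (W0inv + Nk[k] * S[k]
                   + (beta0 * Nk[k] / (beta0 + Nk[k])) * (d @ d.T))
    nu = nu0 + Nk + 1                                          # nu_k as stated above
    return beta, m, Winv, nu
```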

17. Mixtures of Gaussians - Parameters
We can now arrive at the parameters
$$\mathbb{E}_{\mu_k, \Lambda_k}\!\left[(x_n - \mu_k)^T \Lambda_k (x_n - \mu_k)\right] = D \beta_k^{-1} + \nu_k (x_n - m_k)^T W_k (x_n - m_k)$$
$$\ln \tilde{\Lambda}_k = \mathbb{E}[\ln |\Lambda_k|] = \sum_{i=1}^{D} \psi\!\left(\frac{\nu_k + 1 - i}{2}\right) + D \ln 2 + \ln |W_k|$$
$$\ln \tilde{\pi}_k = \mathbb{E}[\ln \pi_k] = \psi(\alpha_k) - \psi(\hat{\alpha}), \qquad \hat{\alpha} = \sum_k \alpha_k$$
Here ψ(·) is the digamma function, defined as $\psi(a) = \frac{d}{da}\ln\Gamma(a)$. The last two results follow from the properties of the Wishart and Dirichlet distributions.
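Both expectations are short computations with `scipy.special.digamma`; a sketch (the function name is mine, and $\ln|W_k| = -\ln|W_k^{-1}|$ is used to avoid an explicit inverse):

```python
# E[ln pi_k] and E[ln |Lambda_k|] from the Dirichlet/Wishart variational factors.
import numpy as np
from scipy.special import digamma

def expected_logs(alpha, nu, Winv):
    K, D, _ = Winv.shape
    ln_pi = digamma(alpha) - digamma(alpha.sum())              # E[ln pi_k]
    ln_Lam = np.array([
        digamma(0.5 * (nu[k] + 1 - np.arange(1, D + 1))).sum()
        + D * np.log(2.0)
        - np.log(np.linalg.det(Winv[k]))                       # ln |W_k|
        for k in range(K)
    ])                                                         # E[ln |Lambda_k|]
    return ln_pi, ln_Lam
```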

18. Mixtures of Gaussians - Parameters
We can finally find the responsibilities
$$r_{nk} \propto \tilde{\pi}_k \tilde{\Lambda}_k^{1/2} \exp\left(-\frac{1}{2}\,\mathbb{E}_{\mu_k, \Lambda_k}\!\left[(x_n - \mu_k)^T \Lambda_k (x_n - \mu_k)\right]\right)$$
The optimization is stepwise (see the sketch after this list):
1. Estimate µ, Λ and then r_nk
2. Estimate π and Z
3. Check for convergence - return to 1 if not converged
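Wiring the pieces together, a compact sketch of the whole cycle (it reuses the hypothetical helpers `mixture_statistics`, `update_gauss_wishart`, `expected_logs`, and `responsibilities` from the previous slides; the priors and initialization are illustrative choices):

```python
# Full variational mixture-of-Gaussians loop, as sketched on the last slides.
import numpy as np

def vb_gmm(X, K, n_iter=120, alpha0=1.0, beta0=1.0, nu0=None, seed=0):
    N, D = X.shape
    nu0 = D if nu0 is None else nu0
    m0, W0inv = X.mean(axis=0), np.eye(D)                 # illustrative priors
    rng = np.random.default_rng(seed)
    r = rng.dirichlet(np.ones(K), size=N)                 # random initial r_nk
    for _ in range(n_iter):                               # fixed count; could test convergence
        Nk, xbar, S = mixture_statistics(X, r)            # step 1: statistics from r_nk
        beta, m, Winv, nu = update_gauss_wishart(Nk, xbar, S,
                                                 m0, beta0, W0inv, nu0)
        alpha = alpha0 + Nk                               # q*(pi) = Dir(alpha)
        ln_pi, ln_Lam = expected_logs(alpha, nu, Winv)
        W = np.linalg.inv(Winv)
        log_rho = np.empty((N, K))
        for k in range(K):                                # ln rho_nk up to a constant
            d = X - m[k]
            quad = D / beta[k] + nu[k] * np.einsum('nd,de,ne->n', d, W[k], d)
            log_rho[:, k] = ln_pi[k] + 0.5 * ln_Lam[k] - 0.5 * quad
        r = responsibilities(log_rho)                     # step 2: new r_nk
    return r, alpha, m, Winv, nu
```

A useful property in practice: components that receive no data keep $\alpha_k \approx \alpha_0$ and effectively switch themselves off, so the fit tends to prune excess components automatically.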

19. Mixture of Gaussians - Example
[Figure: snapshots of the variational mixture fit after 0, 15, 60, and 120 iterations.]
