
Fast and Simple Natural-Gradient Variational Inference with Mixture of Exponential-family Approximations

Wu Lin (UBC), June 11, 2019. Joint work with Mohammad Emtiyaz Khan (AIP, RIKEN) and Mark Schmidt (UBC).


  1. Fast and Simple Natural-Gradient Variational Inference with Mixture of Exponential-family Approximations. Wu Lin (UBC), June 11, 2019. Joint work with Mohammad Emtiyaz Khan (AIP, RIKEN) and Mark Schmidt (UBC).

  2. Variational Inference (VI)

     VI approximates the posterior $p(z \mid \mathcal{D}) \approx q(z \mid \lambda_z)$ by maximizing the evidence lower bound (ELBO):

     $\max_{\lambda_z} \mathcal{L}(\lambda_z) := \mathbb{E}_{q}\left[\log p(\mathcal{D}, z) - \log q(z \mid \lambda_z)\right]$

     where $p(\mathcal{D}, z)$ is the probabilistic model, $\mathcal{D}$ is the data, and $q(z \mid \lambda_z)$ is a tractable distribution parametrized by $\lambda_z$.
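To make the ELBO concrete, here is a minimal sketch that estimates it by Monte Carlo. The toy model (prior z ~ N(0, 1), likelihood y | z ~ N(z, 1), one observation y) and all function names are assumptions chosen for illustration; they do not come from the slides. The model is conjugate, so the exact posterior N(y/2, 1/2) and the log evidence log N(y | 0, 2) are known in closed form and serve as a check.

```python
import math
import random


def log_normal_pdf(x, mean, var):
    """Log density of a univariate Gaussian N(x | mean, var)."""
    return -0.5 * (math.log(2.0 * math.pi * var) + (x - mean) ** 2 / var)


def log_joint(y, z):
    """Toy model (assumption): log p(D, z) with z ~ N(0, 1) and y | z ~ N(z, 1)."""
    return log_normal_pdf(z, 0.0, 1.0) + log_normal_pdf(y, z, 1.0)


def elbo_monte_carlo(y, mu, var, num_samples=20000, seed=0):
    """Estimate E_q[log p(D, z) - log q(z)] with samples z ~ q = N(mu, var)."""
    rng = random.Random(seed)
    total = 0.0
    for _ in range(num_samples):
        z = rng.gauss(mu, math.sqrt(var))
        total += log_joint(y, z) - log_normal_pdf(z, mu, var)
    return total / num_samples


y = 1.0
log_evidence = log_normal_pdf(y, 0.0, 2.0)               # marginal of y is N(0, 2)
elbo_exact_q = elbo_monte_carlo(y, mu=y / 2.0, var=0.5)  # q equals the exact posterior
elbo_poor_q = elbo_monte_carlo(y, mu=-1.0, var=2.0)      # a deliberately poor q
```

When q equals the exact posterior, the integrand is constant in z and the estimate matches the log evidence; any other q gives a strictly smaller value in expectation, which is why maximizing the ELBO over $\lambda_z$ pulls q toward the posterior.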

  3. ELBO Optimization

     Black-box VI (BBVI): $\lambda_z \leftarrow \lambda_z + \beta \nabla_{\lambda_z} \mathcal{L}(\lambda_z)$

     Natural-gradient VI (NGVI): $\lambda_z \leftarrow \lambda_z + \beta F_z(\lambda_z)^{-1} \nabla_{\lambda_z} \mathcal{L}(\lambda_z)$, where $F_z(\lambda_z)^{-1} \nabla_{\lambda_z} \mathcal{L}(\lambda_z)$ is the natural gradient and $F_z(\lambda_z)$ is the Fisher information matrix of $q(z \mid \lambda_z)$.

     Advantages of NGVI:
     - NGVI can be simple and fast when $q$ is in the exponential family (e.g., Gaussian) (Khan and Lin, AISTATS 2017).

     NGVI for the exponential family: $\lambda_z \leftarrow \lambda_z + \beta \nabla_{m_z} \mathcal{L}(\lambda_z)$, because $\nabla_{m_z} \mathcal{L}(\lambda_z) = F_z(\lambda_z)^{-1} \nabla_{\lambda_z} \mathcal{L}(\lambda_z)$, where $\lambda_z$ is the natural parameter and $m_z$ is the expectation parameter of $q$.

     [Figure: test log2 loss versus epoch on the Australian and Breast Cancer datasets, comparing Gradient VI and Natural-Gradient VI.]
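The exponential-family update $\lambda_z \leftarrow \lambda_z + \beta \nabla_{m_z} \mathcal{L}(\lambda_z)$ can be sketched for a univariate Gaussian q = N(mu, var), with natural parameters (mu/var, -1/(2 var)) and expectation parameters (mu, mu^2 + var). The toy model (z ~ N(0, 1), y | z ~ N(z, 1)) and the helper names below are assumptions for illustration, not the authors' code. The gradient with respect to the expectation parameters is obtained from the gradients with respect to mu and var by the chain rule, so no Fisher matrix is formed or inverted.

```python
def elbo_grads(y, mu, var):
    """Gradients of the toy ELBO w.r.t. the mean and variance of q = N(mu, var).

    Toy model (assumption): z ~ N(0, 1), y | z ~ N(z, 1), so up to a constant
    ELBO = -0.5*((y - mu)**2 + var) - 0.5*(mu**2 + var) + 0.5*log(var).
    """
    grad_mu = y - 2.0 * mu
    grad_var = -1.0 + 0.5 / var
    return grad_mu, grad_var


def to_mean_var(lam):
    """Convert natural parameters (mu/var, -1/(2 var)) to (mu, var)."""
    var = -0.5 / lam[1]
    return lam[0] * var, var


def ngvi_step(y, lam, beta):
    """One update lam <- lam + beta * grad_m ELBO in natural parameters."""
    mu, var = to_mean_var(lam)
    grad_mu, grad_var = elbo_grads(y, mu, var)
    # Chain rule through the expectation parameters m = (mu, mu**2 + var):
    grad_m1 = grad_mu - 2.0 * mu * grad_var
    grad_m2 = grad_var
    return (lam[0] + beta * grad_m1, lam[1] + beta * grad_m2)


y = 1.0
lam0 = (3.0 / 4.0, -0.5 / 4.0)  # start far from the posterior, at q = N(3, 4)

# A single step with beta = 1.
mean_one_step, var_one_step = to_mean_var(ngvi_step(y, lam0, beta=1.0))

# Thirty damped steps with beta = 0.5.
lam = lam0
for _ in range(30):
    lam = ngvi_step(y, lam, beta=0.5)
mean_damped, var_damped = to_mean_var(lam)
```

In this conjugate toy model the step with beta = 1 lands exactly on the posterior N(0.5, 0.5) from any valid starting point. That is a property of conjugacy, not a general guarantee; for non-conjugate models the same update is applied with a smaller step size and estimated gradients.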

  4. Problem Formulation

     Challenges of NGVI when $q(z)$ is not in the exponential family:
     - Computing $F_z(\lambda_z)^{-1} \nabla_{\lambda_z} \mathcal{L}(\lambda_z)$ can be complicated.
     - $F_z(\lambda_z)$ can be singular.
     - Often there is no simple update beyond the exponential family.

     Our goal: perform a simple NGVI update for more flexible variational approximations (e.g., ones that capture skewness or multi-modality).

     [Figure: (a) Skew Gaussian; (b) Finite Mixture of Gaussians. Axes are labeled "logit u" and "log P"; one panel is labeled "exact".]

  5. This Work

     Main contribution: we propose a new NGVI update for a class of mixtures of exponential-family distributions.

     We consider the following mixture:

     $q(z \mid \lambda) = \int q(z \mid w, \lambda_z)\, q(w \mid \lambda_w)\, dw$

     where both $q(z \mid w, \lambda_z)$ and $q(w \mid \lambda_w)$ are in the exponential family.

     We propose to use the (joint) Fisher matrix $F_{wz}$ of $q(w, z \mid \lambda)$, since $\nabla_m \mathcal{L}(\lambda) = F_{wz}(\lambda)^{-1} \nabla_\lambda \mathcal{L}(\lambda)$, where $m$ is the proposed expectation parameter.

     - Proposed NGVI update: $\lambda \leftarrow \lambda + \beta \nabla_m \mathcal{L}(\lambda)$
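The mixture structure $q(z \mid \lambda) = \int q(z \mid w, \lambda_z)\, q(w \mid \lambda_w)\, dw$ is easiest to see when w is discrete, so the integral becomes a finite sum. The sketch below evaluates such a marginal for a two-component Gaussian mixture. The weights, means, variances, and function names are illustrative assumptions, and the code only shows the form of the approximation; it does not implement the proposed update.

```python
import math


def normal_pdf(x, mean, var):
    """Density of a univariate Gaussian N(x | mean, var)."""
    return math.exp(-0.5 * (x - mean) ** 2 / var) / math.sqrt(2.0 * math.pi * var)


def mixture_pdf(z, weights, means, variances):
    """q(z) = sum_w q(w) q(z | w): a categorical q(w) mixing Gaussian q(z | w)."""
    return sum(
        pi * normal_pdf(z, m, v) for pi, m, v in zip(weights, means, variances)
    )


# Hypothetical parameters: q(w) is categorical, q(z | w) is Gaussian.
weights = [0.3, 0.7]
means = [-2.0, 2.0]
variances = [0.5, 0.5]

# Riemann-sum check that the marginal is a proper density on [-10, 10].
step = 0.01
grid = [-10.0 + i * step for i in range(2001)]
total_mass = sum(mixture_pdf(z, weights, means, variances) for z in grid) * step

density_at_modes = [mixture_pdf(m, weights, means, variances) for m in means]
density_between = mixture_pdf(0.0, weights, means, variances)
```

Each factor is in the exponential family, yet the marginal over z is bimodal and therefore not, which is the situation the joint Fisher matrix of $q(w, z \mid \lambda)$ is meant to handle.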

  6. Proposed NGVI

     Advantages of the proposed NGVI:
     - It has the same cost as BBVI if computing $\nabla_m \mathcal{L}(\lambda)$ is easy.
     - It is faster than BBVI.

     [Figure: three panels titled breast_cancer_scale, wine, and covtype_scale, with x-axis "Iterations" and y-axis labels "Negative ELBO", "Test RMSE", and "KL(q|p)". Legends: BBVI-1, BBVI-3, BBVI-5, BBVI-10 and NGVI-1, NGVI-3, NGVI-5, NGVI-10; BBVI(Gauss), BBVI(Skew-Gauss), and NGVI(Skew-Gauss); BBVI and NGVI. Annotations: M=290, M=32.]

     Variational approximations:
     - Finite mixture of exp-family distributions:
       - Mixture of Gaussians (multi-modality)
       - Birnbaum-Saunders distribution (non-Gaussian mixture)
     - Gaussian compound distribution:
       - Skew Gaussian (skewness)
       - Normal inverse-Gaussian (heavy tails)
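As an illustration of a Gaussian compound distribution, the sketch below draws samples by ancestral sampling: first w, then z given w. The specific construction (w ~ N(0, 1) and z | w ~ N(mu + alpha * |w|, var)) is one standard way to obtain a skewed Gaussian compound and is assumed here for illustration; the slides do not spell out the parametrization, and the function names are hypothetical.

```python
import math
import random


def sample_skew_gaussian(mu, alpha, var, num_samples, seed=0):
    """Ancestral sampling: w ~ N(0, 1), then z | w ~ N(mu + alpha * |w|, var)."""
    rng = random.Random(seed)
    samples = []
    for _ in range(num_samples):
        w = rng.gauss(0.0, 1.0)
        samples.append(rng.gauss(mu + alpha * abs(w), math.sqrt(var)))
    return samples


def sample_skewness(xs):
    """Standardized third central moment of a list of numbers."""
    n = len(xs)
    mean = sum(xs) / n
    m2 = sum((x - mean) ** 2 for x in xs) / n
    m3 = sum((x - mean) ** 3 for x in xs) / n
    return m3 / m2 ** 1.5


skewed = sample_skew_gaussian(mu=0.0, alpha=2.0, var=0.25, num_samples=20000)
symmetric = sample_skew_gaussian(mu=0.0, alpha=0.0, var=0.25, num_samples=20000)

skewness_skewed = sample_skewness(skewed)
skewness_symmetric = sample_skewness(symmetric)
```

With alpha = 0 the compound reduces to an ordinary Gaussian and the sample skewness is close to zero; a positive alpha shifts mass to the right and produces positive skewness, the kind of asymmetry a single Gaussian cannot represent.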

  7. Summary & Poster Presentation

     Conclusion: a simple NGVI update for approximations outside the exp-family.

     Poster presentation:
     - This work: Poster #217, Pacific Ballroom, Today, 6:30 PM
     - New gradient estimators via Stein’s lemma: “Stein’s Lemma for the Reparameterization Trick with Exponential-family Mixtures”, the workshop on Stein’s method, Saturday
