Fast and Simple Natural-Gradient Variational Inference with Mixture - - PowerPoint PPT Presentation

SLIDE 1

Fast and Simple Natural-Gradient Variational Inference with Mixture of Exponential-family Approximations

Wu Lin (UBC) June 11, 2019

Joint work with Mohammad Emtiyaz Khan (AIP, RIKEN) and Mark Schmidt (UBC)

SLIDE 2

Variational Inference (VI)

VI approximates the posterior, p(z|D) ≈ q(z|λz), by maximizing the evidence lower bound (ELBO):

$$\max_{\lambda_z} \; \mathcal{L}(\lambda_z) := \mathbb{E}_{q(z|\lambda_z)}\big[ \log p(\mathcal{D}, z) - \log q(z|\lambda_z) \big]$$

where p(D, z) is the probabilistic model, D is the data, and q(z|λz) is a tractable distribution parametrized by λz.
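The ELBO is an expectation under q, so it can be estimated by sampling. The sketch below is a minimal illustration, not the authors' code: it assumes a hypothetical toy model (prior z ~ N(0, 1), likelihood x_i | z ~ N(z, 1)) and a Gaussian q(z) = N(mu, sigma²). The data values and function names are made up for the example. Because this toy model is conjugate, the Monte Carlo estimate can be compared with a closed-form value.

```python
import numpy as np

# Hypothetical toy model (an assumption for illustration only):
#   prior       z ~ N(0, 1)
#   likelihood  x_i | z ~ N(z, 1)
data = np.array([0.5, 1.2, 0.9, 1.5, 0.7])


def log_normal(x, mean, var):
    """Log density of N(mean, var) evaluated at x."""
    return -0.5 * np.log(2.0 * np.pi * var) - 0.5 * (x - mean) ** 2 / var


def log_joint(z):
    """log p(D, z) for an array of latent samples z with shape [S]."""
    log_prior = log_normal(z, 0.0, 1.0)
    log_lik = log_normal(data[None, :], z[:, None], 1.0).sum(axis=1)
    return log_prior + log_lik


def elbo_monte_carlo(mu, sigma, num_samples, rng):
    """Estimate E_q[log p(D, z) - log q(z)] with samples z ~ N(mu, sigma^2)."""
    z = mu + sigma * rng.standard_normal(num_samples)
    return np.mean(log_joint(z) - log_normal(z, mu, sigma ** 2))


def elbo_closed_form(mu, sigma):
    """Exact ELBO for the toy model, using E_q[(x - z)^2] = (x - mu)^2 + sigma^2."""
    n = data.size
    e_log_lik = -0.5 * n * np.log(2.0 * np.pi) - 0.5 * np.sum((data - mu) ** 2 + sigma ** 2)
    e_log_prior = -0.5 * np.log(2.0 * np.pi) - 0.5 * (mu ** 2 + sigma ** 2)
    entropy = 0.5 * np.log(2.0 * np.pi * np.e * sigma ** 2)
    return e_log_lik + e_log_prior + entropy


rng = np.random.default_rng(0)
elbo_mc = elbo_monte_carlo(0.8, 0.5, 200_000, rng)
elbo_exact = elbo_closed_form(0.8, 0.5)
```

Comparing `elbo_mc` with `elbo_exact` gives a quick sanity check on the estimator before moving to models where no closed form exists.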

SLIDE 3

ELBO Optimization

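Black-box VI (BBVI):

$$\lambda_z \leftarrow \lambda_z + \beta \, \nabla_{\lambda_z} \mathcal{L}(\lambda_z)$$

Natural-gradient VI (NGVI):

$$\lambda_z \leftarrow \lambda_z + \beta \, \underbrace{F_z(\lambda_z)^{-1} \nabla_{\lambda_z} \mathcal{L}(\lambda_z)}_{\text{natural gradient}}$$

where Fz(λz) is the Fisher information matrix of q(z|λz) and β is the step size.

Advantages of NGVI:

◮ NGVI can be simple and fast when q is in the exponential family (e.g., Gaussian) (Khan and Lin, AISTATS 2017).

NGVI for the exponential family, with natural parameter λz and expectation parameter mz:

$$\lambda_z \leftarrow \lambda_z + \beta \, \nabla_{m_z} \mathcal{L}(\lambda_z) \quad \text{because} \quad \nabla_{m_z} \mathcal{L}(\lambda_z) = F_z(\lambda_z)^{-1} \nabla_{\lambda_z} \mathcal{L}(\lambda_z)$$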
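The identity between the gradient with respect to the expectation parameter and the Fisher-preconditioned gradient can be checked numerically. The sketch below is an illustrative assumption, not the authors' experiment: q is a 1-D Gaussian with natural parameters λ = (μ/σ², −1/(2σ²)), and the target is a hypothetical unnormalized Gaussian, log p̃(z) = b·z − (a/2)·z². In this conjugate case the gradient with respect to m is simply λ* − λ with λ* = (b, −a/2), so the NGVI update is a convex combination of λ and λ*. The ELBO below is defined up to the target's normalizing constant.

```python
import numpy as np


def mean_var(lam):
    """Convert natural parameters (mu/var, -1/(2 var)) to (mean, variance)."""
    var = -1.0 / (2.0 * lam[1])
    return lam[0] * var, var


def elbo(lam, a, b):
    """E_q[b z - (a/2) z^2] + entropy of q, up to an additive constant."""
    mean, var = mean_var(lam)
    m1, m2 = mean, mean ** 2 + var  # expectation parameters E[z], E[z^2]
    entropy = 0.5 * np.log(2.0 * np.pi * np.e * var)
    return b * m1 - 0.5 * a * m2 + entropy


def fisher(lam):
    """Fisher matrix of a 1-D Gaussian in natural parameters: Cov[(z, z^2)]."""
    mean, var = mean_var(lam)
    off_diag = 2.0 * mean * var
    return np.array([[var, off_diag],
                     [off_diag, 4.0 * mean ** 2 * var + 2.0 * var ** 2]])


def finite_diff_grad(f, x, eps=1e-6):
    """Central finite-difference gradient of a scalar function f at x."""
    grad = np.zeros_like(x)
    for i in range(x.size):
        step = np.zeros_like(x)
        step[i] = eps
        grad[i] = (f(x + step) - f(x - step)) / (2.0 * eps)
    return grad


a, b = 2.0, 1.0                        # hypothetical target coefficients
lam_target = np.array([b, -0.5 * a])   # natural parameters of the target
lam = np.array([0.3, -0.25])           # initial q: mean 0.6, variance 2

# Route 1: precondition the ordinary gradient with the inverse Fisher matrix.
grad_lam = finite_diff_grad(lambda l: elbo(l, a, b), lam)
nat_grad_via_fisher = np.linalg.solve(fisher(lam), grad_lam)

# Route 2: gradient with respect to the expectation parameter (closed form here).
nat_grad_via_mean = lam_target - lam

# NGVI iterations using route 2; no Fisher matrix is formed or inverted.
beta = 0.5
for _ in range(60):
    lam = lam + beta * (lam_target - lam)
```

Both routes give the same direction, but the second never builds the Fisher matrix, which is the sense in which the exponential-family update is simple.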

[Figure: test log2 loss versus epoch on the Australian and Breast Cancer datasets, comparing Gradient VI and Natural-Gradient VI.]

SLIDE 4

Problem Formulation

Challenges of NGVI when q(z) is not in the exponential family:

◮ Computing Fz(λz)⁻¹ ∇λz L(λz) can be complicated.
◮ Fz(λz) can be singular.
◮ There is often no simple update beyond the exponential family.
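A small numerical check shows how the Fisher matrix of a non-exponential-family q can become singular. The sketch below is an illustrative assumption rather than an example from the talk: q(z) = π N(z|μ1, 1) + (1 − π) N(z|μ2, 1) with parameters (π, μ1, μ2), and the Fisher matrix is computed by integrating the outer product of the score on a grid. When μ1 = μ2, the density does not depend on π, so the score for π vanishes and the matrix loses rank.

```python
import numpy as np


def normal_pdf(z, mean):
    """Density of N(mean, 1) evaluated at z."""
    return np.exp(-0.5 * (z - mean) ** 2) / np.sqrt(2.0 * np.pi)


def mixture_fisher(pi, mu1, mu2):
    """Fisher matrix of pi*N(mu1, 1) + (1 - pi)*N(mu2, 1) w.r.t. (pi, mu1, mu2)."""
    z = np.linspace(-15.0, 15.0, 30001)
    dz = z[1] - z[0]
    n1, n2 = normal_pdf(z, mu1), normal_pdf(z, mu2)
    q = pi * n1 + (1.0 - pi) * n2
    # Score function: gradient of log q(z) with respect to each parameter.
    score = np.stack([
        (n1 - n2) / q,
        pi * n1 * (z - mu1) / q,
        (1.0 - pi) * n2 * (z - mu2) / q,
    ])
    # F = integral of q(z) * score(z) score(z)^T dz, as a Riemann sum.
    return (score * q * dz) @ score.T


# Identical components: the mixing weight is not identifiable.
eig_degenerate = np.linalg.eigvalsh(mixture_fisher(0.3, 0.0, 0.0))

# Well-separated components: all three parameters are identifiable.
eig_separated = np.linalg.eigvalsh(mixture_fisher(0.3, -3.0, 3.0))
```

The smallest entry of `eig_degenerate` is numerically zero, so inverting that matrix for a natural-gradient step would fail, while `eig_separated` stays bounded away from zero.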

Our goal: perform a simple NGVI update for more flexible variational approximations (e.g., ones that capture skewness or multi-modality).

[Figure: (a) Skew Gaussian; (b) Finite Mixture of Gaussians, with panels labeled 1 to 8 and "exact" (axes: logit u, log P).]

SLIDE 5

This Work

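Main contribution: we propose a new NGVI update for a class of mixtures of exponential-family distributions.

We consider the following mixture, in which both factors are in the exponential family:

$$q(z|\lambda) = \int \underbrace{q(z|w, \lambda_z)}_{\text{exp-family}} \, \underbrace{q(w|\lambda_w)}_{\text{exp-family}} \, dw$$

We propose to use the (joint) Fisher matrix Fwz of q(w, z|λ), since

$$\nabla_m \mathcal{L}(\lambda) = F_{wz}(\lambda)^{-1} \nabla_\lambda \mathcal{L}(\lambda)$$

where m is the proposed expectation parameter.

◮ Proposed NGVI update: λ ← λ + β ∇m L(λ)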
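To make the mixture structure concrete, the sketch below treats a finite mixture of 1-D Gaussians as a joint distribution q(w, z) = q(z|w) q(w), with w a categorical component index. It only illustrates the joint-versus-marginal view that the proposed Fisher matrix is built on; it does not implement the proposed update, and the weights, means, and standard deviations are hypothetical values chosen for the example.

```python
import numpy as np

# Hypothetical mixture parameters (assumptions for illustration only).
weights = np.array([0.2, 0.5, 0.3])   # q(w): categorical over 3 components
means = np.array([-3.0, 0.0, 4.0])    # q(z | w = k) = N(means[k], stds[k]^2)
stds = np.array([0.5, 1.0, 0.8])


def sample_joint(num_samples, rng):
    """Draw (w, z) from the joint: first w ~ q(w), then z ~ q(z | w)."""
    w = rng.choice(weights.size, size=num_samples, p=weights)
    z = means[w] + stds[w] * rng.standard_normal(num_samples)
    return w, z


def marginal_density(z):
    """q(z) = sum_k q(w = k) q(z | w = k), i.e. the integral over w as a sum."""
    z = np.asarray(z, dtype=float)[..., None]
    comp = np.exp(-0.5 * ((z - means) / stds) ** 2) / (stds * np.sqrt(2.0 * np.pi))
    return comp @ weights


rng = np.random.default_rng(0)
w_samples, z_samples = sample_joint(100_000, rng)

# The marginal should integrate to one (checked with a Riemann sum on a grid).
grid = np.linspace(-10.0, 12.0, 22001)
total_mass = np.sum(marginal_density(grid)) * (grid[1] - grid[0])
```

The marginal q(z) is not in the exponential family, but each factor of the joint is, which is the property the slide relies on.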

SLIDE 6

Proposed NGVI

Advantages of the proposed NGVI:

◮ It has the same cost as BBVI if computing ∇m L(λ) is easy.
◮ It is faster than BBVI.

[Figure: three plots against iterations. Test RMSE on wine (M=32): BBVI(Gauss), NGVI(Skew-Gauss), BBVI(Skew-Gauss). KL(q|p) on breast_cancer_scale: BBVI-1, BBVI-3, BBVI-5, BBVI-10, NGVI-1, NGVI-3, NGVI-5, NGVI-10. Negative ELBO on covtype_scale (M=290): BBVI, NGVI.]

Variational approximations:

◮ Finite mixture of exp-family distributions:
  • Mixture of Gaussians (multi-modality)
  • Birnbaum-Saunders distribution (non-Gaussian mixture)

◮ Gaussian compound distribution:
  • Skew Gaussian (skewness)
  • Normal inverse-Gaussian (heavy tails)
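A Gaussian compound distribution draws a continuous mixing variable w and then a Gaussian z whose parameters depend on w. The sketch below shows how this can produce skewness, assuming the standard stochastic representation of a skew-normal variable, z = μ + α|w| + σε with w, ε ~ N(0, 1). The exact parametrization used in the talk is not given on the slide, so the names `mu`, `alpha`, and `sigma` are hypothetical.

```python
import numpy as np


def sample_gaussian_compound(mu, alpha, sigma, num_samples, rng):
    """Sample z | w ~ N(mu + alpha * |w|, sigma^2) with mixing variable w ~ N(0, 1)."""
    w = rng.standard_normal(num_samples)
    noise = rng.standard_normal(num_samples)
    return mu + alpha * np.abs(w) + sigma * noise


def sample_skewness(x):
    """Empirical skewness: third central moment over variance^(3/2)."""
    centered = x - x.mean()
    return np.mean(centered ** 3) / np.mean(centered ** 2) ** 1.5


rng = np.random.default_rng(0)

# alpha > 0 shifts the conditional mean by a half-normal amount: right skew.
skew_positive = sample_skewness(sample_gaussian_compound(0.0, 2.0, 1.0, 200_000, rng))

# alpha = 0 removes the dependence on w and recovers a plain Gaussian.
skew_zero = sample_skewness(sample_gaussian_compound(0.0, 0.0, 1.0, 200_000, rng))
```

Setting `alpha` to zero recovers the symmetric Gaussian case, so the extra parameter is what lets the approximation capture asymmetry.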

SLIDE 7

Summary & Poster Presentation

Conclusion: a simple NGVI update for approximations outside the exp-family.

Poster Presentation:

◮ This work: Poster #217, Pacific Ballroom, Today, 6:30 PM
◮ New gradient estimators via Stein’s lemma: “Stein’s Lemma for the Reparameterization Trick with Exponential-family Mixtures”, the workshop on Stein’s method, Saturday
