Variational Russian Roulette for Deep Bayesian Nonparametrics



SLIDE 1

Variational Russian Roulette for Deep Bayesian Nonparametrics

Kai Xu [1] Joint work with Akash Srivastava [1,2] and Charles Sutton [1,3,4]

[1] University of Edinburgh [2] MIT-IBM Watson AI Lab [3] Google AI [4] Alan Turing Institute

SLIDE 2

tl;dr

We train a variational autoencoder with unbounded latent dimension. The latent dimension is controlled by a sparse binary matrix with infinitely many columns, following an Indian buffet process. The actual dimensionality of the VAE is inferred during training.

SLIDE 3
  • Fig. 1 Infinite VAE with an IBP prior (Chatzis, 2014; Singh et al. 2017)

How is an infinite binary matrix useful for a VAE?

m is a sparse binary matrix with a finite number of rows (one per data point) and infinitely many columns.
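
As a rough illustration of such a matrix, the sketch below draws m from the standard stick-breaking construction of the Indian buffet process. It is not taken from the slides: the function name sample_ibp_mask, the concentration alpha, and the finite num_columns (used only so the matrix can be displayed) are assumptions, as is the element-wise gating of a Gaussian code by m.

```python
import numpy as np

def sample_ibp_mask(num_data, num_columns, alpha, rng):
    """Draw the first num_columns columns of an IBP binary matrix.

    Stick-breaking construction: nu_k ~ Beta(alpha, 1) and
    pi_k = nu_1 * ... * nu_k, so column probabilities shrink with k.
    """
    nu = rng.beta(alpha, 1.0, size=num_columns)
    pi = np.cumprod(nu)
    # Each entry m[n, k] is an independent Bernoulli(pi_k) draw.
    m = (rng.random((num_data, num_columns)) < pi).astype(np.int64)
    return m, pi

rng = np.random.default_rng(0)
m, pi = sample_ibp_mask(num_data=5, num_columns=20, alpha=3.0, rng=rng)

# Assumed gating: a latent dimension is used by data point n only if m[n, k] == 1.
z = rng.standard_normal(m.shape)
masked_z = m * z
```

Because pi_k decays with k, later columns are almost always zero, which is why only finitely many latent dimensions are active for any finite dataset.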

SLIDE 4

Previous work uses a truncated variational approximation; our method avoids truncation altogether. Why is a truncated variational approximation not ideal?

  • 1. Truncation level is not easy to choose.
  • 2. Poor interaction with amortised inference.

How do we avoid truncating the variational posterior altogether?

Truncation-free variational approximation

SLIDE 5
  • 1. Introduce a new infinite variational approximation

Essentially an infinite mixture of truncated approximations

  • 2. Derive a new tractable ELBO

Essentially an infinite mixture of truncated ELBOs

  • 3. Compute an unbiased gradient estimate of the ELBO

The infinite summation is estimated by Russian roulette sampling. At any time, we retain only a finite representation in memory to compute the unbiased gradient estimate of the infinite target.
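
The idea behind step 3 can be sketched with the generic Russian roulette estimator for an infinite sum. This is a minimal illustration, not the estimator used in RAVE: the function name russian_roulette_sum, the geometric stopping rule with continue_prob, and the toy series 0.5^k are all assumptions chosen for the example. Each term is divided by the probability of surviving long enough to reach it, which keeps the estimate unbiased even though only finitely many terms are ever evaluated.

```python
import numpy as np

def russian_roulette_sum(term, continue_prob, rng):
    """Return one unbiased sample of sum_{k>=1} term(k).

    After each term we continue with probability continue_prob, so term k
    is reached with probability continue_prob ** (k - 1) and is reweighted
    by the inverse of that probability.
    """
    total = 0.0
    survival = 1.0  # P(term k is reached); starts at 1 for k = 1
    k = 1
    while True:
        total += term(k) / survival
        if rng.random() >= continue_prob:
            return total  # roulette fired: stop after finitely many terms
        survival *= continue_prob
        k += 1

rng = np.random.default_rng(0)
# Toy target: sum_{k>=1} 0.5**k = 1.
samples = [russian_roulette_sum(lambda k: 0.5 ** k, 0.8, rng) for _ in range(20000)]
estimate = float(np.mean(samples))
```

Each individual sample touches only a random, finite number of terms, while the average over samples converges to the value of the full series. The stopping probability trades variance against cost: stopping too eagerly inflates the weights on later terms.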

RAVE: Roulette-based Amortized Variational Expectations

SLIDE 6

Results

The truncated approximation tends to activate collapsed components.
Collapsed dimensions convey no information.
Russian roulette (marked by red vertical lines) automatically truncates at the right dimension.

Only the first few components are informative.
Non-informative components are still activated.

Please come and see my poster @ Pacific Ballroom #223.
