From Variational to Deterministic Autoencoders, or the joys of density estimation in latent spaces - PowerPoint PPT Presentation


From Variational to Deterministic Autoencoders, or the joys of density estimation in latent spaces. Antonio Vergari. Joint work with: Partha Ghosh, Mehdi S. M. Sajjadi, Bernhard Schölkopf, Michael Black. University of California, Los Angeles


SLIDE 1

From Variational to Deterministic Autoencoders

Antonio Vergari University of California, Los Angeles @tetraduzione

26th August 2020 - UCL - AI Center Seminars

  • or the joys of density estimation in latent spaces

Joint work with: Partha Ghosh, Mehdi S.M. Sajjadi, Bernhard Schölkopf, Michael Black

SLIDE 2

Why?

SLIDE 3

Why?

learning time

SLIDE 4

Why?

learning time

the generative modeling paradigm

inference time

SLIDE 5

Variational Autoencoders

(VAEs)

⇒ Generative modeling [Van Den Oord 2017, Tolstikhin 2019, Razavi 2019, ...]
⇒ Density Estimation [Kingma 2014, Rezende 2014, Burda 2015, ...]
⇒ Disentanglement [Higgins 2016, ...]

Kingma, Diederik P., and Max Welling. "Auto-Encoding Variational Bayes." ICLR 2014

SLIDE 6

Variational Autoencoders

(VAEs)

Regularized Autoencoders

(RAEs)

⇒ Generative modeling [Van Den Oord 2017, Tolstikhin 2019, Razavi 2019, ...]
⇒ Density Estimation [Kingma 2014, Rezende 2014, Burda 2015, ...]
⇒ Disentanglement [Higgins 2016, ...]

a simpler alternative for generative modeling

SLIDE 7

! disclaimer !

SLIDE 8

Variational Autoencoders (VAEs)

SLIDE 9

Variational Autoencoders (VAEs)

SLIDE 10

Variational Autoencoders (VAEs)

SLIDE 11

How to train VAEs?

SLIDE 12

How to train VAEs?

SLIDE 13

How to train VAEs?

SLIDE 14

How to train VAEs?
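As a reference for the "How to train VAEs?" slides (reconstructed from the cited papers, not from the extracted slide content), the standard VAE training objective is the evidence lower bound (ELBO), maximized jointly over the decoder parameters θ and encoder parameters φ:

```latex
\mathcal{L}_{\mathrm{ELBO}}(\theta, \phi; x)
  = \mathbb{E}_{q_\phi(z \mid x)}\!\left[ \log p_\theta(x \mid z) \right]
  - \mathrm{KL}\!\left( q_\phi(z \mid x) \,\|\, p(z) \right)
```

The first term rewards reconstruction; the KL term pulls each posterior towards the prior, and balancing the two is exactly the issue the next slides discuss.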

SLIDE 15

Training VAEs: issues

  • Balancing reconstruction quality and compression [Burda et al. 2015, Tolstikhin et al. 2018, ...]
  • Spurious global optima [Dai et al. 2019]
  • Posterior collapse [van den Oord et al. 2018, ...]
  • Prior/aggregate posterior mismatch [Tolstikhin et al. 2018, Dai et al. 2019, ...]

SLIDE 16

Issue #1: balancing training

SLIDE 17

Issue #1: balancing training

  • one-sample approximation!
SLIDE 18

Issue #1: balancing training

Weighting the KL term!
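A minimal sketch of the KL-weighting fix discussed here, in the style of a β-VAE objective. The closed-form Gaussian KL and all function names are my own illustration, not taken from the slides:

```python
import numpy as np

def gaussian_kl(mu, log_var):
    """Closed-form KL( N(mu, sigma^2) || N(0, I) ), summed over latent dims."""
    return 0.5 * np.sum(np.exp(log_var) + mu**2 - 1.0 - log_var, axis=-1)

def beta_vae_loss(x, x_rec, mu, log_var, beta=1.0):
    """Reconstruction error plus a beta-weighted KL term (the knob on this slide)."""
    rec = np.sum((x - x_rec) ** 2, axis=-1)  # per-sample reconstruction error
    return np.mean(rec + beta * gaussian_kl(mu, log_var))
```

Setting β < 1 favors reconstruction quality, β > 1 favors compression towards the prior; β = 1 recovers the plain ELBO (up to the likelihood's constant terms).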

SLIDE 19

Sampling VAEs

SLIDE 20

Sampling VAEs

SLIDE 21

Sampling VAEs

SLIDE 22

Sampling VAEs

SLIDE 23

Sampling VAEs

the aggregate posterior should ideally match the prior!
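One way to see this: the aggregate posterior is the mixture q(z) = (1/N) Σᵢ q(z|xᵢ), so sampling it means picking a training point at random and sampling its per-point posterior. A toy numpy sketch (all shapes and values here are hypothetical placeholders for real encoder outputs):

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical encoder outputs (means, log-variances) for N training points.
N, d = 1000, 2
mus = rng.normal(size=(N, d))
log_vars = np.full((N, d), -2.0)

def sample_aggregate_posterior(n):
    """q(z) = (1/N) sum_i q(z|x_i): pick a training point, sample its posterior."""
    idx = rng.integers(0, N, size=n)
    std = np.exp(0.5 * log_vars[idx])
    return mus[idx] + std * rng.normal(size=(n, d))

z = sample_aggregate_posterior(5000)
```

Nothing forces this mixture to equal the N(0, I) prior; when the two differ, prior samples land in regions the decoder never saw, which is the mismatch the next slides illustrate.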

SLIDE 24

Issue #2: sampling spurious codes

the prior/aggregate posterior mismatch

SLIDE 25

Issue #2: sampling spurious codes

the decoder has a hard time “imagining”

SLIDE 26

Can we do better?

SLIDE 27

Simpler VAEs?

SLIDE 28

Simpler VAEs?

SLIDE 29

Simpler VAEs?

SLIDE 30

Simpler VAEs?

SLIDE 31

Simpler VAEs?

SLIDE 32

Simpler VAEs?

SLIDE 33

Simpler VAEs?

SLIDE 34

Simpler VAEs?

SLIDE 35

How to have a smooth latent space?

ideally,

SLIDE 36

Regularized Autoencoders (RAEs)!

SLIDE 37

Which regularization for RAEs?

SLIDE 38

Which regularization for RAEs?

Gradient penalization [Gulrajani et al. 2017; Mescheder et al. 2018]

SLIDE 39

Which regularization for RAEs?

Gradient penalization [Gulrajani et al. 2017; Mescheder et al. 2018]
Spectral normalization [Miyato et al. 2018]

SLIDE 40

Which regularization for RAEs?

Gradient penalization [Gulrajani et al. 2017; Mescheder et al. 2018]
Spectral normalization [Miyato et al. 2018]
Weight decay [Bishop et al. 1996]
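Putting the pieces together, a hedged sketch of an RAE-style objective using the simplest of the regularizers above: an L2 penalty on the codes plus decoder weight decay (gradient penalties or spectral normalization are drop-in alternatives). Function and argument names are my own:

```python
import numpy as np

def rae_loss(x, x_rec, z, decoder_weights, lam_z=1e-3, lam_w=1e-4):
    """RAE objective sketch: reconstruction + ||z||^2 penalty + weight decay.

    lam_z keeps the deterministic codes compact; lam_w smooths the decoder,
    replacing the KL term (and its one-sample noise) of a VAE.
    """
    rec = np.mean(np.sum((x - x_rec) ** 2, axis=-1))
    z_reg = np.mean(np.sum(z ** 2, axis=-1))              # compact latent codes
    w_reg = sum(np.sum(w ** 2) for w in decoder_weights)  # smooth decoder
    return rec + lam_z * z_reg + lam_w * w_reg
```

Note there is no sampling and no KL term anywhere: the encoder is deterministic, which is what sidesteps the balancing and collapse issues from the earlier slides.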

SLIDE 41

RAE for image generation

RAEs generate equally good or better samples and interpolations

RAE+L2 VAE

SLIDE 42

RAE for image generation

even when regularization is implicit!

AE RAE+L2 VAE

SLIDE 43

Common image benchmarks: MNIST

SLIDE 44

Common image benchmarks: CIFAR10

SLIDE 45

Common image benchmarks: CelebA

SLIDE 46

How do we sample from RAEs…?

SLIDE 47

Sampling RAEs…?

SLIDE 48

Ex-Post Density Estimation (XPDE)

SLIDE 49

Ex-Post Density Estimation (XPDE)

SLIDE 50

Which density estimator for XPDE?

SLIDE 51

Which density estimator for XPDE?

a SOTA deep generative model, e.g., an autoregressive model or a flow [van den Oord et al. 2019, Razavi et al. 2020]

...or another VAE! But then the VAE training and sampling issues are still there!

SLIDE 52

Which density estimator for XPDE?

striving for simplicity: just Gaussian Mixture Models
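A sketch of ex-post density estimation with a GMM, assuming scikit-learn is available and that `Z` holds the latent codes of the training set produced by a trained encoder (here a random placeholder):

```python
import numpy as np
from sklearn.mixture import GaussianMixture

# Placeholder for latent codes from a trained (deterministic) encoder.
Z = np.random.default_rng(0).normal(size=(500, 8))

# Ex-post density estimation: fit a 10-component full-covariance GMM on the codes.
gmm = GaussianMixture(n_components=10, covariance_type="full", random_state=0).fit(Z)

# Sampling the autoencoder = sample codes from the GMM, then decode them.
z_new, _ = gmm.sample(64)
# x_new = decoder(z_new)  # decoder is whatever autoencoder you trained
```

The density estimator is fit after training, so it never interferes with the reconstruction objective; that is the whole trick.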

SLIDE 53

Can’t we just do XPDE for VAEs?

SLIDE 54

Can’t we just do XPDE for VAEs?

SLIDE 55

Can’t we just do XPDE for VAEs?

SLIDE 56

Ex-Post Density Estimation (XPDE)

XPDE consistently improves sample quality for all VAE variants

SLIDE 57

Why...does it work?

SLIDE 58

Why...does it work?

ConvNets are very, very, very smooth! [LeCun et al. 1994]

SLIDE 59

Why...does it work?

ConvNets are very, very, very smooth! [LeCun et al. 1994] ...and these datasets are full, full, full of regularities!

SLIDE 60

What about more challenging data?

E.g., generating structured objects like molecules

SLIDE 61

VAEs for molecules?

Molecule VAE [Gómez-Bombarelli et al. 2017]

GrammarVAE (GVAE) [Kusner et al. 2017]

Constrained Graph VAE (CGVAE) [Liu et al. 2018, ...]

⇒ ...

SLIDE 62

GRAE: RAEifying the Grammar VAE

More accurate generation than Kusner et al. 2017

SLIDE 63

RAEify your VAEs!

RAE VAE

SLIDE 64

RAEify your VAEs!

RAE VAE

SLIDE 65

RAEify your VAEs!

RAE VAE

SLIDE 66

Is this really simple… and new?

SLIDE 67

AEs for generative modeling

MCMC schemes to sample from Contractive [Rifai et al. 2011] and Denoising Autoencoders [Bengio et al. 2013]

SLIDE 68

Other flavours of XPDE

Two-Stage VAEs [Dai et al. 2019] use another VAE for XPDE ...but then the VAE training and sampling issues are still there!

VQ-VAEs [van den Oord et al. 2019, Razavi et al. 2020] use a PixelCNN over discrete latents ...but VQ-VAEs are RAEs, not VAEs!

SLIDE 69

What did we lose?

SLIDE 70

What did we lose?

Variational Autoencoders

(VAEs)

⇒ Generative modeling ✓
⇒ Density Estimation ✓
⇒ Disentanglement ✓

Regularized Autoencoders

(RAEs)

⇒ Generative modeling ✓
⇒ Density Estimation ?
⇒ Disentanglement ?

SLIDE 71

RAEs for density estimation ?

RAEs (and VQ-VAEs) are like GANs: they are implicit likelihood models!

SLIDE 72

RAEs for density estimation (?)

RAEs (and VQ-VAEs) are like GANs: they are implicit likelihood models! An approximate ELBO can be recovered under some geometric assumptions.

SLIDE 73

RAEs for disentanglement (?)

SLIDE 74

Conclusions

SLIDE 75

aiPhones

⇒ Phone capabilities
⇒ aiCloud, aiWatch, aiTunes, ...
⇒ 4k Video, ...

SLIDE 76

aiPhones RegularPhone

⇒ Phone capabilities
⇒ aiCloud, aiWatch, aiTunes, ...
⇒ 4k Video, ...

what is the simplest model that gets you further?

SLIDE 77

Takeaway #1: RAEify your VAEs!

RAE VAE

SLIDE 78

Takeaway #2: use XPDE!

Boost your VAEs by training a density estimator on the latent codes!

SLIDE 79

Paper: https://openreview.net/forum?id=S1g7tpEYDS
Code: https://github.com/ParthaEth/Regularized_autoencoders-RAE-