Variational Inference and Generative Models. CS 285, Instructor: Sergey Levine, UC Berkeley.



SLIDE 1

Variational Inference and Generative Models

CS 285

Instructor: Sergey Levine UC Berkeley

SLIDE 2
Today's Lecture

  • 1. Probabilistic latent variable models
  • 2. Variational inference
  • 3. Amortized variational inference
  • 4. Generative models: variational autoencoders
  • Goals:
    • Understand latent variable models in deep learning
    • Understand how to use (amortized) variational inference

SLIDE 3

Probabilistic models

SLIDE 4

Latent variable models

mixture element
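As a concrete instance (an illustrative example, not taken verbatim from the slides), a mixture model treats the mixture element as a discrete latent variable z, e.g. for a Gaussian mixture:

```latex
p(x) = \sum_{k=1}^{K} p(x \mid z = k)\, p(z = k)
     = \sum_{k=1}^{K} \pi_k \, \mathcal{N}(x \mid \mu_k, \Sigma_k)
```

Here z = k selects which mixture element generated x, and the \pi_k are the mixture weights.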

SLIDE 5

Latent variable models in general

“easy” distribution (e.g., Gaussian)
“easy” distribution (e.g., Gaussian)
“easy” distribution (e.g., conditional Gaussian)
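In symbols, the general latent variable model combines these “easy” distributions into a marginal that is typically intractable:

```latex
p(x) = \int p(x \mid z)\, p(z)\, dz
```

where, for example, p(z) = \mathcal{N}(0, I) and p(x \mid z) = \mathcal{N}\big(\mu_{\mathrm{nn}}(z), \sigma_{\mathrm{nn}}(z)\big), with the mean and variance given by a neural network. Each factor is easy to evaluate, but the integral over z is not.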

SLIDE 6

Latent variable models in RL

conditional latent variable models for multi-modal policies
latent variable models for model-based RL

SLIDE 7

Other places we’ll see latent variable models

Mombaur et al. ‘09 Muybridge (c. 1870) Ziebart ‘08 Li & Todorov ‘06

Using RL/control + variational inference to model human behavior
Using generative models and variational inference for exploration

SLIDE 8

How do we train latent variable models?

SLIDE 9

Estimating the log-likelihood
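For reference, the maximum likelihood objective with latent variables, and the naive Monte Carlo estimate it suggests (standard derivation, notation mine):

```latex
\theta \leftarrow \arg\max_\theta \frac{1}{N} \sum_{i} \log p_\theta(x_i),
\qquad
\log p_\theta(x_i) = \log \int p_\theta(x_i \mid z)\, p(z)\, dz
\approx \log \frac{1}{M} \sum_{j=1}^{M} p_\theta(x_i \mid z_j), \quad z_j \sim p(z)
```

Sampling z from the prior gives an estimator, but in high dimensions almost all sampled z values explain x_i poorly, so the estimate needs impractically many samples. This is what motivates variational inference.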

SLIDE 10

Variational Inference

SLIDE 11

The variational approximation

SLIDE 12

The variational approximation

Jensen’s inequality
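The step referenced here applies Jensen's inequality to the log-likelihood: since log is concave, log E[·] ≥ E[log ·]. Written out (standard derivation, notation mine):

```latex
\log p(x) = \log \int p(x \mid z)\, p(z)\, dz
          = \log \mathbb{E}_{z \sim q(z)}\!\left[\frac{p(x \mid z)\, p(z)}{q(z)}\right]
          \ge \mathbb{E}_{z \sim q(z)}\!\left[\log p(x \mid z) + \log p(z)\right] + \mathcal{H}(q)
```

The right-hand side is the evidence lower bound (ELBO), which variational inference maximizes instead of the intractable log-likelihood.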

SLIDE 13

A brief aside…

Entropy:

Intuition 1: how random is the random variable?
Intuition 2: how large is the log probability in expectation under itself?
[figure: high-entropy vs. low-entropy distributions; making the distribution as wide as possible maximizes both parts]
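Written out, the entropy being discussed is:

```latex
\mathcal{H}(p) = -\mathbb{E}_{x \sim p(x)}\big[\log p(x)\big] = -\int p(x) \log p(x)\, dx
```

A wide, flat distribution has high entropy; a sharply peaked one has low entropy.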

SLIDE 14

A brief aside…

KL-Divergence:

Intuition 1: how different are two distributions?
Intuition 2: how small is the expected log probability of one distribution under another, minus entropy? (why entropy?)
[figure: one candidate distribution maximizes the first part; another also maximizes the second part (makes it as wide as possible)]
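Written out, the KL divergence being discussed is:

```latex
D_{\mathrm{KL}}(q \,\|\, p) = \mathbb{E}_{x \sim q(x)}\!\left[\log \frac{q(x)}{p(x)}\right]
= -\mathbb{E}_{x \sim q(x)}\big[\log p(x)\big] - \mathcal{H}(q)
```

which matches intuition 2: the (negative) expected log probability of p under q, minus the entropy of q.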

SLIDE 15

The variational approximation

SLIDE 16

The variational approximation

SLIDE 17

How do we use this?

how?

SLIDE 18

What’s the problem?

SLIDE 19

Amortized Variational Inference

SLIDE 20

What’s the problem?

SLIDE 21

Amortized variational inference

how do we calculate this?

SLIDE 22

Amortized variational inference

look up formula for entropy of a Gaussian
can just use policy gradient!
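A minimal numpy sketch of both annotations above (an illustrative toy, not the lecture's code): the score-function ("policy gradient") estimator for the gradient of an expectation, using f(z) = z² as a hypothetical stand-in for the ELBO integrand, plus the closed-form Gaussian entropy.

```python
import numpy as np

def score_function_grad(f, mu, n_samples=200_000, seed=0):
    """Score-function estimate of d/dmu E_{z ~ N(mu, 1)}[f(z)].

    Uses grad_mu log N(z; mu, 1) = (z - mu); for f(z) = z**2 the true
    gradient is d/dmu (mu**2 + 1) = 2*mu.
    """
    rng = np.random.default_rng(seed)
    z = rng.normal(mu, 1.0, size=n_samples)
    return np.mean(f(z) * (z - mu))

def gaussian_entropy(sigma):
    """Closed-form entropy of a univariate Gaussian N(mu, sigma^2)."""
    return 0.5 * np.log(2.0 * np.pi * np.e * sigma**2)

mu = 1.0
g = score_function_grad(lambda z: z**2, mu)
# g is close to the true gradient 2*mu, but the estimator is high-variance,
# which is exactly the problem the next slide asks about.
```

The high variance of this estimator is why it requires many samples and small learning rates in practice.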

What’s wrong with this gradient?

SLIDE 23

The reparameterization trick

Is there a better way?
Most autodiff software (e.g., TensorFlow) will compute this for you!
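Autodiff handles the bookkeeping in practice; the sketch below (numpy, done by hand, with a hypothetical f(z) = z²) just shows the mechanism: writing z = mu + sigma * eps with eps ~ N(0, 1) makes z a deterministic function of the parameters, so the gradient can pass through the sample.

```python
import numpy as np

def reparam_grad(mu, sigma, n_samples=100_000, seed=0):
    """Reparameterized estimate of d/dmu E_{z ~ N(mu, sigma^2)}[z**2].

    z = mu + sigma * eps, so d/dmu (z**2) = 2 * z (pathwise derivative).
    The true gradient of E[z**2] = mu**2 + sigma**2 is 2*mu.
    """
    rng = np.random.default_rng(seed)
    eps = rng.normal(0.0, 1.0, size=n_samples)
    z = mu + sigma * eps
    return np.mean(2.0 * z)

g = reparam_grad(mu=1.0, sigma=1.0)
# close to the true gradient 2*mu, with much lower variance than the
# score-function estimator for the same sample budget
```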

SLIDE 24

Another way to look at it…

this often has a convenient analytical form (e.g., KL-divergence for Gaussians)

SLIDE 25

Reparameterization trick vs. policy gradient

  • Policy gradient
    • Can handle both discrete and continuous latent variables
    • High variance; requires multiple samples & small learning rates
  • Reparameterization trick
    • Only continuous latent variables
    • Very simple to implement
    • Low variance

SLIDE 26

Example Models

SLIDE 27

The variational autoencoder
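The pieces above compose into the VAE training loss. The numpy sketch below (a minimal toy with hypothetical dimensions and random, untrained weights; not the lecture's implementation) computes one forward pass of the negative ELBO: a Gaussian encoder q(z|x), the reparameterization trick, a decoder mean, and the analytic KL to a standard normal prior.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy dimensions, chosen only for illustration
x_dim, z_dim, h_dim = 8, 2, 16

# One-hidden-layer encoder and decoder with random (untrained) weights
W1 = rng.normal(0, 0.1, (x_dim, h_dim))       # encoder hidden layer
W_mu = rng.normal(0, 0.1, (h_dim, z_dim))     # encoder mean head
W_ls = rng.normal(0, 0.1, (h_dim, z_dim))     # encoder log-std head
W2 = rng.normal(0, 0.1, (z_dim, h_dim))       # decoder hidden layer
W_out = rng.normal(0, 0.1, (h_dim, x_dim))    # decoder output

def neg_elbo(x):
    h = np.tanh(x @ W1)
    mu, log_sigma = h @ W_mu, h @ W_ls
    eps = rng.normal(size=mu.shape)
    z = mu + np.exp(log_sigma) * eps           # reparameterization trick
    x_hat = np.tanh(z @ W2) @ W_out            # decoder mean
    # Gaussian reconstruction NLL with unit variance (constants dropped)
    recon = 0.5 * np.sum((x - x_hat) ** 2, axis=-1)
    # Analytic KL( q(z|x) || N(0, I) ) per example
    kl = 0.5 * np.sum(mu**2 + np.exp(2 * log_sigma) - 1.0 - 2 * log_sigma, axis=-1)
    return np.mean(recon + kl)

x = rng.normal(size=(32, x_dim))
loss = neg_elbo(x)
```

Training would minimize this loss over the weights by gradient descent; in real implementations, an autodiff framework differentiates through the reparameterized sample.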

SLIDE 28

Using the variational autoencoder

SLIDE 29

Conditional models

SLIDE 30

Examples

SLIDE 31
  • 1. collect data
  • 2. learn embedding of image & dynamics model (jointly)
  • 3. run iLQG to learn to reach image of goal

a type of variational autoencoder with temporally decomposed latent state!

SLIDE 32

Local models with images

SLIDE 33

Local models with images

variational autoencoder with stochastic dynamics

SLIDE 34

We’ll see more of this for…

Mombaur et al. ‘09 Muybridge (c. 1870) Ziebart ‘08 Li & Todorov ‘06

Using RL/control + variational inference to model human behavior
Using generative models and variational inference for exploration