

SLIDE 1

Adaptive Density Estimation for Generative Models

Thomas Lucas Konstantin Shmelkov∗ Karteek Alahari Cordelia Schmid Jakob Verbeek

∗ Now at Huawei

SLIDE 2

Generative modelling

Goal Given samples from target distribution p∗, train a model pθ to match p∗

SLIDE 3

Generative modelling

Goal Given samples from target distribution p∗, train a model pθ to match p∗

  • Maximum likelihood: Eval. training points under the model

SLIDE 4

Generative modelling

Goal Given samples from target distribution p∗, train a model pθ to match p∗

  • Maximum likelihood: Eval. training points under the model
  • Adversarial training¹: Eval. samples under (an approximation of) p∗

¹ Ian Goodfellow et al. (2014). “Generative adversarial nets”. In: NIPS.
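The contrast between the two objectives can be sketched on a toy 1-D Gaussian problem (the target distribution, the closed-form MLE, and the analytically optimal critic below are assumptions for the sketch, not from the slides):

```python
import numpy as np

rng = np.random.default_rng(0)
data = rng.normal(loc=2.0, scale=1.0, size=1000)  # samples from p*

def gaussian_logpdf(x, mu, sigma):
    return -0.5 * np.log(2 * np.pi * sigma**2) - (x - mu) ** 2 / (2 * sigma**2)

# Maximum likelihood: evaluate training points under the model p_theta.
# For a Gaussian model the MLE is closed-form: mu = mean, sigma = std.
mu_mle, sigma_mle = data.mean(), data.std()
avg_loglik = gaussian_logpdf(data, mu_mle, sigma_mle).mean()

# Adversarial training: evaluate *model samples* under a critic that
# approximates p*/(p* + p_theta); here we use the analytically optimal one.
def critic(x, mu_t, sigma_t):
    log_pstar = gaussian_logpdf(x, 2.0, 1.0)
    log_ptheta = gaussian_logpdf(x, mu_t, sigma_t)
    return 1.0 / (1.0 + np.exp(log_ptheta - log_pstar))  # D(x) = P(real)

samples = rng.normal(0.0, 1.0, size=1000)  # a mis-placed model (mu = 0)
d_on_model_samples = critic(samples, 0.0, 1.0).mean()  # well below 0.5
```

The critic scores the mis-placed model's samples as mostly fake (D well below 0.5), which is exactly the signal the adversarial objective feeds back to the generator.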

SLIDE 5

Schematic illustration

Data Model

SLIDE 6

Maximum likelihood

Data Model

SLIDE 7

Maximum likelihood

Data Model

Over-generalization

SLIDE 8

Maximum likelihood

Data Model

Over-generalization

Consequences

  • MLE covers the full support of the distribution
  • Produces unrealistic samples in low-density regions
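Over-generalization is easy to reproduce numerically: fit a single Gaussian by MLE to a bimodal target and a large fraction of model samples land in the gap between the modes, a region the data essentially never visits (the bimodal target and the single-Gaussian model are assumptions for this sketch):

```python
import numpy as np

rng = np.random.default_rng(0)
# Bimodal target: two well-separated modes at -4 and +4.
data = np.concatenate([rng.normal(-4, 0.5, 5000), rng.normal(4, 0.5, 5000)])

# MLE for a single Gaussian covers the full support of the data...
mu, sigma = data.mean(), data.std()

# ...so many model samples fall between the modes ("unrealistic samples").
samples = rng.normal(mu, sigma, 10000)
frac_model_in_gap = np.mean(np.abs(samples) < 2)  # sizeable fraction
frac_data_in_gap = np.mean(np.abs(data) < 2)      # essentially zero
```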

SLIDE 9

Adversarial training

Mode-dropping

SLIDE 10

Adversarial training

Mode-dropping

Consequences

  • Produces high-quality samples
  • Parts of the support are dropped

SLIDE 11

Hybrid training approach

Goal

  • Explicitly optimize both dataset coverage and sample quality

SLIDE 12

Hybrid training approach

Goal

  • Explicitly optimize both dataset coverage and sample quality
  • Discriminator can be seen as a learnable inductive bias

SLIDE 13

Hybrid training approach

Goal

  • Explicitly optimize both dataset coverage and sample quality
  • Discriminator can be seen as a learnable inductive bias
  • Retain valid likelihood to evaluate support coverage

SLIDE 14

Hybrid training approach

Goal

  • Explicitly optimize both dataset coverage and sample quality
  • Discriminator can be seen as a learnable inductive bias
  • Retain valid likelihood to evaluate support coverage

Challenges

  • Tradeoff between the two objectives: need more flexibility

SLIDE 15

Hybrid training approach

Goal

  • Explicitly optimize both dataset coverage and sample quality
  • Discriminator can be seen as a learnable inductive bias
  • Retain valid likelihood to evaluate support coverage

Challenges

  • Tradeoff between the two objectives: need more flexibility
  • Limiting parametric assumptions required for tractable MLE, e.g. Gaussianity, conditional independence

SLIDE 16

Hybrid training approach

Goal

  • Explicitly optimize both dataset coverage and sample quality
  • Discriminator can be seen as a learnable inductive bias
  • Retain valid likelihood to evaluate support coverage

Challenges

  • Tradeoff between the two objectives: need more flexibility
  • Limiting parametric assumptions required for tractable MLE, e.g. Gaussianity, conditional independence
  • Often no likelihood in pixel space²

² A. Larsen et al. (2016). “Autoencoding beyond pixels using a learned similarity metric”. In: ICML.

SLIDE 17

Conditional independence

Data

SLIDE 18

Conditional independence

Data

p(x|z) = ∏_{i=1}^{N} N(x_i | µ_θ,i(z), σ)
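The factorized likelihood above amounts to summing per-pixel Gaussian log-densities; a minimal numpy sketch (mu here stands in for a hypothetical decoder output µ_θ(z)):

```python
import numpy as np

# Conditionally independent decoder: given z, pixels are independent
# Gaussians with means mu_theta(z) and a shared scale sigma.
def decoder_loglik(x, mu, sigma):
    # log p(x|z) = sum_i log N(x_i | mu_i(z), sigma)
    return np.sum(-0.5 * np.log(2 * np.pi * sigma**2)
                  - (x - mu) ** 2 / (2 * sigma**2))

mu = np.zeros(4)  # hypothetical decoder output mu_theta(z), 4 "pixels"
sharp = decoder_loglik(np.zeros(4), mu, 1.0)        # exact reconstruction
blurry = decoder_loglik(np.ones(4) * 0.5, mu, 1.0)  # small per-pixel error
```

Because the loss decomposes per pixel, small independent errors everywhere (blur) are penalized only mildly, which is the failure mode the next slide illustrates.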

SLIDE 19

Conditional independence

Strongly penalized by GAN | Strongly penalized by MLE

Data

p(x|z) = ∏_{i=1}^{N} N(x_i | µ_θ,i(z), σ)

SLIDE 20

Going beyond conditional independence

Avoiding strong parametric assumptions

  • Lift reconstruction losses into a feature space

SLIDE 21

Going beyond conditional independence

Avoiding strong parametric assumptions

  • Lift reconstruction losses into a feature space
  • Deep invertible models: valid density in image space

SLIDE 22

Going beyond conditional independence

Avoiding strong parametric assumptions

  • Lift reconstruction losses into a feature space
  • Deep invertible models: valid density in image space
  • Retain fast sampling for adversarial training

SLIDE 23

Maximum likelihood estimation with feature targets

SLIDE 24

Maximum likelihood estimation with feature targets

Amortized variational inference in feature space:

Lθ,φ,ψ(x) = −E_{qφ(z|x)}[ln pθ(fψ(x)|z)] + DKL(qφ(z|x) || pθ(z)) − ln |det ∂fψ/∂x|

  • First two terms: evidence lower bound in feature space
  • Last term: change-of-variables correction for the invertible fψ
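A one-sample Monte Carlo estimate of this loss can be sketched with a toy invertible feature map (the elementwise affine fψ, the unit-variance Gaussian q and decoder, and the encoder output below are all stand-in assumptions, not the paper's architecture):

```python
import numpy as np

rng = np.random.default_rng(0)
n = 4
x = rng.normal(size=n)

def log_normal(v, mu, sigma):
    return np.sum(-0.5 * np.log(2 * np.pi * sigma**2)
                  - (v - mu) ** 2 / (2 * sigma**2))

# Toy invertible feature map f_psi: elementwise affine, so the
# log-det-Jacobian has a closed form: n * log|a|.
a, b = 2.0, 0.1
f_x = a * x + b
log_det_jac = n * np.log(np.abs(a))

# Hypothetical Gaussian encoder q_phi(z|x) with unit variance.
mu_q, sigma_q = f_x.mean() * np.ones(n), 1.0
z = mu_q + sigma_q * rng.normal(size=n)          # reparameterized sample
recon = log_normal(f_x, z, 1.0)                  # ln p_theta(f_psi(x) | z)
# KL(N(mu_q, sigma_q^2) || N(0, 1)) in closed form.
kl = 0.5 * np.sum(mu_q**2 + sigma_q**2 - 1 - 2 * np.log(sigma_q))

loss = -recon + kl - log_det_jac                 # L_{theta,phi,psi}(x)
```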

SLIDE 25

Maximum likelihood estimation with feature targets

Amortized variational inference in feature space:

Lθ,φ,ψ(x) = −E_{qφ(z|x)}[ln pθ(fψ(x)|z)] + DKL(qφ(z|x) || pθ(z)) − ln |det ∂fψ/∂x|

  • Last term: change of variables

SLIDE 26

Maximum likelihood estimation with feature targets

Maximum Likelihood

Amortized variational inference in feature space:

Lθ,φ,ψ(x) = −E_{qφ(z|x)}[ln pθ(fψ(x)|z)] + DKL(qφ(z|x) || pθ(z)) − ln |det ∂fψ/∂x|

SLIDE 27

Maximum likelihood estimation with feature targets

Maximum Likelihood

Adv. training

Amortized variational inference in feature space:

Lθ,φ,ψ(x) = −E_{qφ(z|x)}[ln pθ(fψ(x)|z)] + DKL(qφ(z|x) || pθ(z)) − ln |det ∂fψ/∂x|

Adversarial training with Adaptive Density Estimation:

Ladv(pθ,ψ) = −E_{pθ(z)}[ ln( D(fψ⁻¹(µθ(z))) / (1 − D(fψ⁻¹(µθ(z)))) ) ]

  • Adversarial update using the log-ratio loss
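The log-ratio generator loss amounts to the following (a minimal sketch given only discriminator outputs on generated samples; the clipping constant is an assumption for numerical safety):

```python
import numpy as np

def generator_log_ratio_loss(d_fake):
    # -E[ ln D(x_fake) - ln(1 - D(x_fake)) ]: pushes D(x_fake) toward 1,
    # with non-saturating gradients when the discriminator wins.
    d_fake = np.clip(d_fake, 1e-7, 1 - 1e-7)  # avoid log(0)
    return -np.mean(np.log(d_fake) - np.log(1.0 - d_fake))

# Fooled discriminator (D > 0.5): loss is negative.
fooled = generator_log_ratio_loss(np.array([0.9, 0.8]))
# Detected fakes (D < 0.5): loss is positive, with strong gradient.
caught = generator_log_ratio_loss(np.array([0.1, 0.2]))
```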

SLIDE 28

Experiments on CIFAR10

Samples | Real images

SLIDE 29

Experiments on CIFAR10

Samples | Real images

Model            BPD ↓   IS ↑   FID ↓
GAN
  WGAN-GP          –      7.9     –
  SNGAN            –      7.4    29.3
  SNGAN(R,H)       –      8.2    21.7
MLE
  VAE-IAF         3.1     3.8†   73.5†
  NVP             3.5     4.5†   56.8†
Hybrid
  Ours (v1)       3.8     8.2    17.2
  Ours (v2)       3.5     6.9    28.9
  FlowGan         4.2     3.9     –
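The BPD column converts a model's average negative log-likelihood from nats per image to bits per dimension; a minimal sketch (the example NLL value is an illustrative assumption):

```python
import numpy as np

def bits_per_dim(nll_nats, num_dims):
    # Divide the per-image NLL by ln(2) (nats -> bits) and by the
    # number of dimensions to get bits per dimension.
    return nll_nats / (np.log(2.0) * num_dims)

# CIFAR10 images have 3 * 32 * 32 = 3072 dimensions; e.g. a model with
# an average NLL of ~7452 nats/image scores about 3.5 BPD.
bpd = bits_per_dim(7452.0, 3 * 32 * 32)
```

Lower BPD means the model assigns higher likelihood to held-out data, which is why BPD and the sample-quality metrics (IS, FID) probe complementary failure modes.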

SLIDE 30

Samples and real images (LSUN churches, 64 × 64)

Samples @ 4.3 BPD | Real images

Thank you for listening. Come see us at poster 71 :)
