SLIDE 1
Ian Goodfellow, Staff Research Scientist, Google Brain
NIPS 2017 Workshop: Deep Learning: Bridging Theory and Practice
Long Beach, 2017-12-09

Bridging Theory and Practice of GANs

A sampling of named GAN variants: 3D-GAN AC-GAN AdaGAN AffGAN AL-CGAN ALI AMGAN AnoGAN ArtGAN b-GAN Bayesian GAN BEGAN BiGAN B-GAN CGAN CCGAN CatGAN CoGAN Context-RNN-GAN C-RNN-GAN C-VAE-GAN CycleGAN DTN DCGAN DiscoGAN DR-GAN DualGAN EBGAN f-GAN FF-GAN GAWWN GoGAN GP-GAN IAN iGAN IcGAN ID-CGAN InfoGAN LAPGAN LR-GAN LS-GAN LSGAN MGAN MAGAN MAD-GAN MalGAN MARTA-GAN McGAN MedGAN MIX+GAN MPM-GAN GMAN alpha-GAN WGAN-GP DRAGAN Progressive GAN SN-GAN

SLIDE 2

(Goodfellow 2017)

Generative Modeling

  • Density estimation
  • Sample generation

(Figure: training examples alongside model samples.)

SLIDE 3

Adversarial Nets Framework

(Goodfellow et al., 2014)

x sampled from data → differentiable function D → D(x) tries to be near 1.
Input noise z → differentiable function G → x sampled from model → D.
D tries to make D(G(z)) near 0; G tries to make D(G(z)) near 1.
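The value function from Goodfellow et al. (2014), $V(G, D) = \mathbb{E}_x[\log D(x)] + \mathbb{E}_z[\log(1 - D(G(z)))]$, can be estimated by Monte Carlo. A minimal sketch: the logistic discriminator, the shift generator, and the N(2, 1) "data" below are toy assumptions of mine, not anything from the talk.

```python
import numpy as np

rng = np.random.default_rng(0)

def D(x, a):
    # Toy discriminator: a logistic function of x with slope a.
    return 1.0 / (1.0 + np.exp(-a * x))

def G(z, b):
    # Toy generator: shifts the input noise by b.
    return z + b

def value(a, b, n=100_000):
    # Monte Carlo estimate of V(G, D) = E_x[log D(x)] + E_z[log(1 - D(G(z)))].
    x = rng.normal(2.0, 1.0, n)   # "data": N(2, 1)
    z = rng.normal(0.0, 1.0, n)   # input noise: N(0, 1)
    return np.log(D(x, a)).mean() + np.log(1.0 - D(G(z, b), a)).mean()

# D tries to drive V up; G tries to drive V down.
v_bad_G = value(a=1.0, b=-2.0)   # generator far from the data
v_good_G = value(a=1.0, b=2.0)   # generator matches the data distribution
```

With b = 2 the generator's samples are distributed exactly like the data, so this fixed D can no longer separate them and the value drops, which is what G wants.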

SLIDE 4

How long until GANs can do this?

(Figure: training examples alongside model samples.)

SLIDE 5

Progressive GANs

(Karras et al., 2017)

SLIDE 6

Spectrally Normalized GANs

(Miyato et al., 2017) (Figure: class-conditional samples, e.g. Welsh springer spaniel, palace, pizza.)

SLIDE 7

Building a bridge from simple to complex theoretical models

GANs in pdf space → GANs in generator function space → parameterized GANs → finite-sized GANs → limited-precision GANs

SLIDE 8

Building a bridge from intuition to theory

Basic idea of GANs → Is there an equilibrium? → Is it in the right place? → Do we converge to it? → How quickly?

SLIDE 9

Building the bridge

GANs in pdf space → GANs in generator function space → parameterized GANs → finite-sized GANs → limited-precision GANs

SLIDE 10

Optimizing over densities

(Figure: a generator function maps z to x; shown with data samples, D(x), and the generator density. Goodfellow et al., 2014.)
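When optimizing over densities, Goodfellow et al. (2014) show that for a fixed generator density the optimal discriminator is $D^*(x) = p_{\text{data}}(x) / (p_{\text{data}}(x) + p_g(x))$. A small sketch with two hypothetical Gaussian densities standing in for the data and generator distributions:

```python
import numpy as np

def gauss_pdf(x, mu, sigma):
    # Density of N(mu, sigma^2) evaluated pointwise.
    return np.exp(-0.5 * ((x - mu) / sigma) ** 2) / (sigma * np.sqrt(2.0 * np.pi))

x = np.linspace(-5.0, 5.0, 1001)
p_data = gauss_pdf(x, mu=1.0, sigma=1.0)   # toy data density
p_g = gauss_pdf(x, mu=-1.0, sigma=1.0)     # toy generator density

# Optimal discriminator in pdf space (Goodfellow et al., 2014):
d_star = p_data / (p_data + p_g)
```

Where the two densities cross (x = 0 in this toy setup), D* outputs exactly 1/2; deep in either tail it saturates toward 1 or 0.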

SLIDE 11

Tips and Tricks

  • A good strategy to simplify a model for theoretical purposes is to work in function space.
  • Binary or linear models are often too different from neural net models to provide useful theory.
  • Use convex analysis in this function space.
SLIDE 12

Results

  • Goodfellow et al., 2014:
    • A Nash equilibrium exists
    • The Nash equilibrium corresponds to recovering the data-generating distribution
    • Nested optimization converges
  • Kodali et al., 2017: simultaneous SGD converges
SLIDE 13

Building a bridge from simple to complex theoretical models

GANs in pdf space → GANs in generator function space → parameterized GANs → finite-sized GANs → limited-precision GANs

SLIDE 14

Non-Equilibrium Mode Collapse

  • D in the inner loop: convergence to the correct distribution
  • G in the inner loop: G places all its mass on the single most likely point

$$\min_G \max_D V(G, D) \neq \max_D \min_G V(G, D)$$

(Metz et al., 2016)
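The max-min side of this inequality can be seen directly: with D held fixed, the generator's term of V is linear in G's probabilities, so it is minimized by a point mass on the single point D rates most realistic. A toy sketch with hypothetical discriminator scores over four candidate points (the scores are made up for illustration):

```python
import numpy as np

# Hypothetical discrete support: a fixed discriminator score for each of 4 points.
d_scores = np.array([0.10, 0.30, 0.95, 0.40])

def generator_term(g_probs):
    # The generator's piece of V(G, D): E_{x ~ G}[log(1 - D(x))].
    return float(np.sum(g_probs * np.log(1.0 - d_scores)))

uniform = np.full(4, 0.25)                  # a spread-out generator
collapsed = np.zeros(4)
collapsed[np.argmax(d_scores)] = 1.0        # all mass on D's favorite point

# With D fixed, the inner-loop G prefers to collapse onto one mode:
# generator_term(collapsed) is lower than generator_term(uniform).
```

Since the objective is linear over the probability simplex, its minimum is always at a vertex, which is exactly mode collapse.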

SLIDE 15

Equilibrium Mode Collapse

(Figure: two generator mappings from z to x, illustrating mode collapse.) Neighbors in generator function space are worse (Appendix A1 of Unterthiner et al., 2017).

SLIDE 16

Building a bridge from simple to complex theoretical models

GANs in pdf space → GANs in generator function space → parameterized GANs → finite-sized GANs → limited-precision GANs

SLIDE 17

Simple Non-convergence Example

  • For scalar x and y, consider the value function $V(x, y) = xy$.
  • Does this game have an equilibrium? Where is it?
  • Consider the learning dynamics of simultaneous gradient descent with an infinitesimal learning rate (continuous time), and solve for the trajectory followed by these dynamics:

$$\frac{\partial x}{\partial t} = -\frac{\partial}{\partial x} V(x(t), y(t)), \qquad \frac{\partial y}{\partial t} = \frac{\partial}{\partial y} V(x(t), y(t))$$

SLIDE 18

Solution

This is the canonical example of a saddle point. There is an equilibrium, at x = 0, y = 0.

SLIDE 19

Solution

  • The dynamics are a circular orbit:

$$x(t) = x(0)\cos(t) - y(0)\sin(t), \qquad y(t) = x(0)\sin(t) + y(0)\cos(t)$$

  • Discrete-time gradient descent can spiral outward for large step sizes.
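Both claims are easy to check numerically: the continuous-time solution stays on a circle of constant radius, while each discrete simultaneous gradient step multiplies the radius by $\sqrt{1 + \eta^2} > 1$ and so spirals outward. A minimal sketch (the step size and iteration count are arbitrary choices of mine):

```python
import numpy as np

def simultaneous_gd(x, y, lr, steps):
    # V(x, y) = x * y: x descends its gradient (dV/dx = y),
    # y ascends its gradient (dV/dy = x), updated simultaneously.
    for _ in range(steps):
        x, y = x - lr * y, y + lr * x
    return x, y

x0, y0 = 1.0, 0.0
r0 = np.hypot(x0, y0)

# Discrete steps: the radius grows by a factor sqrt(1 + lr**2) per step.
x1, y1 = simultaneous_gd(x0, y0, lr=0.1, steps=200)

# Continuous-time closed form: a rotation, so the radius is conserved.
t = 20.0
xc = x0 * np.cos(t) - y0 * np.sin(t)
yc = x0 * np.sin(t) + y0 * np.cos(t)
```

After 200 discrete steps at lr = 0.1 the radius has grown by roughly $(1.01)^{100} \approx 2.7$, while the continuous trajectory is still exactly on the unit circle.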

SLIDE 20

Tips and Tricks

  • Use nonlinear dynamical systems theory to study the behavior of optimization algorithms.
  • Demonstrated and advocated especially by Nagarajan and Kolter, 2017.

SLIDE 21

Results

  • The good equilibrium is a stable fixed point (Nagarajan and Kolter, 2017)
  • Two-timescale updates converge (Heusel et al., 2017)
    • Their recommendation: use a different learning rate for G and D
    • My recommendation: decay your learning rate for G
  • Convergence is very inefficient (Mescheder et al., 2017)
SLIDE 22

Intuition for the Jacobian

With $g^{(i)}$ denoting player $i$'s gradient and $H^{(i)}$ its Hessian, the game Jacobian is

$$\begin{pmatrix} H^{(1)} & \nabla_{\theta^{(1)}} g^{(2)} \\ \nabla_{\theta^{(2)}} g^{(1)} & H^{(2)} \end{pmatrix}$$

  • $H^{(1)}$: how firmly does player 1 want to stay in place?
  • $H^{(2)}$: how firmly does player 2 want to stay in place?
  • $\nabla_{\theta^{(2)}} g^{(1)}$: how much can player 2 dislodge player 1?
  • $\nabla_{\theta^{(1)}} g^{(2)}$: how much can player 1 dislodge player 2?
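For the scalar game $V(x, y) = xy$ from the earlier slides, this Jacobian can be written down exactly: both diagonal (Hessian) blocks are zero and only the cross terms survive, giving purely imaginary eigenvalues, which is why the continuous dynamics rotate around the equilibrium instead of converging to it. A quick check (this example is mine, not from the slides):

```python
import numpy as np

# Simultaneous gradient play on V(x, y) = x * y gives the vector field
#   dx/dt = -dV/dx = -y,   dy/dt = +dV/dy = x.
# Its Jacobian at the equilibrium (0, 0):
J = np.array([[0.0, -1.0],
              [1.0,  0.0]])

# Purely imaginary eigenvalues (+i, -i): a rotation, with no attraction.
eigvals = np.linalg.eigvals(J)
```

A stable fixed point would need eigenvalues with negative real parts; here the real parts are exactly zero, matching the circular orbits seen earlier.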

SLIDE 23

What happens for GANs?

The same Jacobian with D and G as the two players: the generator's diagonal block is all zeros, because the optimal discriminator is constant. Locally, the generator does not have any "retaining force".

SLIDE 24

Building a bridge from simple to complex theoretical models

GANs in pdf space → GANs in generator function space → parameterized GANs → finite-sized GANs → limited-precision GANs

SLIDE 25

Does a Nash equilibrium exist, in the right place?

  • PDF space: yes
  • Generator function space: yes, but there can also be bad equilibria
  • What about for neural nets with a finite number of finite-precision parameters?
  • Arora et al., 2017: yes… for mixtures
    • Infinite mixture
    • Approximate an infinite mixture with a finite mixture
SLIDE 26

Open Challenges

  • Design an algorithm that avoids bad equilibria in generator function space, OR reparameterize the problem so that it does not have bad equilibria
  • Design an algorithm that converges rapidly to the equilibrium
  • Study the global convergence properties of the existing algorithms