Some results on GAN dynamics (Ioannis Mitliagkas)



slide-1
SLIDE 1

Some results on GAN dynamics

Ioannis Mitliagkas

slide-2
SLIDE 2

Game dynamics are weird, or rather fascinating

slide-3
SLIDE 3

Start with optimization dynamics

slide-4
SLIDE 4

Optimization

Smooth, differentiable cost function, L
→ Looking for stationary (fixed) points (gradient is 0)
→ Gradient descent

slide-5
SLIDE 5

Optimization

Conservative vector field → Straightforward dynamics

Figure credit: Ferenc Huszar

slide-6
SLIDE 6

Gradient descent

Conservative vector field → Straightforward dynamics

Fixed-point analysis

Jacobian of operator

Hessian of objective, L

slide-7
SLIDE 7

Local convergence

Jacobian of operator

Hessian of objective, L
→ Symmetric, real eigenvalues

Eigenvalues of operator Jacobian, λ(θ*)
→ If ρ(θ*) = max |λ(θ*)| < 1, then fast (linear-rate) local convergence
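
The local convergence condition can be checked numerically on a toy problem. The sketch below assumes a hypothetical quadratic cost L(θ) = ½ θᵀHθ (so θ* = 0) and an arbitrarily chosen step size `eta`; neither comes from the slides. It forms the Jacobian of the gradient descent operator, I − ηH, and compares its spectral radius with what the iterates actually do.

```python
import numpy as np

# Hypothetical quadratic cost L(theta) = 0.5 * theta^T H theta, minimized at theta* = 0.
H = np.array([[2.0, 0.5],
              [0.5, 1.0]])   # symmetric positive definite Hessian (assumed values)
eta = 0.4                    # step size (assumed value)

# Gradient descent operator: F(theta) = theta - eta * grad L(theta).
# Its Jacobian is I - eta * H, which is symmetric because H is.
J_op = np.eye(2) - eta * H

# Symmetric matrix, so eigvalsh applies and the eigenvalues are real.
op_eigs = np.linalg.eigvalsh(J_op)
rho = float(np.max(np.abs(op_eigs)))   # spectral radius of the operator Jacobian

# Run gradient descent and watch the distance to theta* = 0 shrink.
theta = np.array([1.0, -1.0])
for _ in range(50):
    theta = theta - eta * (H @ theta)
final_norm = float(np.linalg.norm(theta))

print(f"spectral radius = {rho:.3f}, |theta_50| = {final_norm:.2e}")
```

With a spectral radius below 1 the error contracts by roughly that factor per step, which is what "fast local convergence" refers to here.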

slide-8
SLIDE 8

Games

slide-9
SLIDE 9

Implicit generative models

  • Generative moment matching networks [Li et al. 2017]
  • Other, domain-specific losses can be used
  • Variational AutoEncoders [Kingma, Welling, 2014]
  • Autoregressive models (PixelRNN [van den Oord, 2016])
slide-10
SLIDE 10

Generative Adversarial Networks

Generator network, G
→ Given latent code, z, produces sample G(z)

Discriminator network, D
→ Given sample x or G(z), estimates probability it is real

Both differentiable

slide-11
SLIDE 11

Generative Adversarial Networks

slide-12
SLIDE 12

Games

Nash Equilibrium

Smooth, differentiable L
→ Looking for local Nash equilibrium
→ Gradient descent
  → Simultaneous
  → Alternating
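
To make the simultaneous/alternating distinction concrete, here is a minimal sketch on an assumed toy game, L(x, y) = x·y, where x minimizes and y maximizes. The game, the starting point and the step size are illustrative choices, not taken from the slides. The only difference between the two loops is whether y sees the old or the freshly updated x.

```python
import numpy as np

def simultaneous_gd(x, y, eta, steps):
    """Both players update from the same (old) iterate."""
    for _ in range(steps):
        gx, gy = y, x                       # dL/dx = y, dL/dy = x for L = x * y
        x, y = x - eta * gx, y + eta * gy   # x descends, y ascends
    return x, y

def alternating_gd(x, y, eta, steps):
    """Players take turns; y reacts to the already-updated x."""
    for _ in range(steps):
        x = x - eta * y
        y = y + eta * x                     # uses the new x
    return x, y

eta, steps = 0.1, 500                       # assumed values
sim_norm = float(np.hypot(*simultaneous_gd(1.0, 1.0, eta, steps)))
alt_norm = float(np.hypot(*alternating_gd(1.0, 1.0, eta, steps)))

print(f"simultaneous: distance to equilibrium = {sim_norm:.2f}")
print(f"alternating:  distance to equilibrium = {alt_norm:.2f}")
```

On this toy game the simultaneous iterates spiral away from the equilibrium at (0, 0), while the alternating iterates stay on a bounded orbit around it; neither converges.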

slide-13
SLIDE 13

Game dynamics

Non-conservative vector field → Rotational dynamics

slide-14
SLIDE 14

Game dynamics under gradient descent

Jacobian is non-symmetric, with complex eigenvalues → Rotations in decision space

Games demonstrate rotational dynamics.
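
A small numerical check of the symmetric-versus-non-symmetric contrast, under assumed toy examples: the bilinear game L(x, y) = x·y with vector field v(x, y) = (∂L/∂x, −∂L/∂y) = (y, −x), and, for comparison, a hypothetical minimization problem with a symmetric Hessian.

```python
import numpy as np

# Game: L(x, y) = x * y, x minimizes and y maximizes.
# Vector field v(x, y) = (dL/dx, -dL/dy) = (y, -x); its Jacobian is constant.
J_game = np.array([[0.0, 1.0],
                   [-1.0, 0.0]])

# Minimization: Hessian of a hypothetical quadratic cost (symmetric by construction).
H_min = np.array([[2.0, 1.0],
                  [1.0, 2.0]])

game_eigs = np.linalg.eigvals(J_game)   # general solver: may return complex values
min_eigs = np.linalg.eigvalsh(H_min)    # symmetric solver: real, ascending order

print("game Jacobian symmetric?", np.allclose(J_game, J_game.T))
print("game eigenvalues:", game_eigs)
print("Hessian eigenvalues:", min_eigs)
```

The game Jacobian is antisymmetric and its eigenvalues are purely imaginary, which is the algebraic signature of the rotation seen in the vector field.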

slide-15
SLIDE 15

The Numerics of GANs

by Mescheder, Nowozin, Geiger

slide-16
SLIDE 16

A word on notation and formulation

Warning:
→ Maximization vs minimization
→ Step size

slide-17
SLIDE 17

Eigen-analysis, gradient descent
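
The eigen-analysis can be sketched as follows. The sketch assumes the minimization-style update θ ← θ − η v(θ) (sign conventions differ between papers), so the operator Jacobian is I − ηJ and each eigenvalue λ of J maps to 1 − ηλ. For λ = a + ib, the condition |1 − ηλ| < 1 requires a > 0 and η < 2a / (a² + b²), so a large imaginary part forces a small step. The helper names and the example Jacobian are hypothetical.

```python
import numpy as np

def max_stable_step(jacobian):
    """Largest eta with |1 - eta * lam| < 1 for every eigenvalue lam (0.0 if none)."""
    lams = np.linalg.eigvals(jacobian)
    if np.any(lams.real <= 0):
        return 0.0                      # no positive step size contracts this mode
    return float(np.min(2 * lams.real / np.abs(lams) ** 2))

def spectral_radius(jacobian, eta):
    """Spectral radius of the gradient descent operator Jacobian I - eta * J."""
    n = jacobian.shape[0]
    return float(np.max(np.abs(np.linalg.eigvals(np.eye(n) - eta * jacobian))))

# Hypothetical game Jacobian: weak attraction (eps) plus strong rotation.
eps = 0.1
J = np.array([[eps, 1.0],
              [-1.0, eps]])             # eigenvalues eps +/- 1j
eta_max = max_stable_step(J)

print(f"eta_max = {eta_max:.4f}")
print(f"rho at 0.5 * eta_max = {spectral_radius(J, 0.5 * eta_max):.4f}")
print(f"rho at 1.5 * eta_max = {spectral_radius(J, 1.5 * eta_max):.4f}")
```

Shrinking `eps` toward 0 shrinks the admissible step size with it, which is one way to read the difficulty caused by eigenvalues with small real part and large imaginary part.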

slide-18
SLIDE 18

The Numerics of GANs

slide-19
SLIDE 19

Make vector field “more conservative”

Idea 1: Minimize the norm of the gradient

slide-20
SLIDE 20

Idea 1: Minimize vector field norm

slide-21
SLIDE 21

Idea 2: use L as regularizer

slide-22
SLIDE 22

Idea 2: use L as regularizer

slide-23
SLIDE 23

Idea 2: use L as regularizer
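
A rough sketch of the regularization idea, under stated assumptions: the regularizer is taken to be ½‖v(θ)‖², the squared norm of the game vector field (as in the consensus optimization method of the paper named on the earlier slide), and the game is the toy bilinear game L(x, y) = x·y. The weight `gamma`, the step size and the minimization-style sign convention are illustrative choices, not values from the slides.

```python
import numpy as np

# Toy bilinear game L(x, y) = x * y: vector field v(theta) = J @ theta.
J = np.array([[0.0, 1.0],
              [-1.0, 0.0]])

def run(gamma, eta=0.1, steps=200):
    """Simultaneous gradient descent on v + gamma * grad(0.5 * ||v||^2)."""
    theta = np.array([1.0, 1.0])
    for _ in range(steps):
        v = J @ theta
        reg_grad = J.T @ v              # gradient of 0.5 * ||v||^2 (v is linear here)
        theta = theta - eta * (v + gamma * reg_grad)
    return float(np.linalg.norm(theta))

plain_norm = run(gamma=0.0)             # unregularized: rotates and drifts outward
regularized_norm = run(gamma=1.0)       # regularized: pulled toward v = 0

print(f"gamma = 0: |theta| = {plain_norm:.3f}")
print(f"gamma = 1: |theta| = {regularized_norm:.2e}")
```

On this example the regularizer adds a positive real part to the purely imaginary eigenvalues, so the same step size that diverges without it now contracts.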

slide-24
SLIDE 24

Other ways to control these rotations?

slide-25
SLIDE 25

Momentum (heavy ball, Polyak 1964)

Jacobian of momentum operator

Non-symmetric, with complex eigenvalues → Rotations in augmented state-space

slide-26
SLIDE 26

Summary

Positive momentum can be bad for adversarial games
→ Positive momentum was very common practice when GANs were first invented.
→ Recent work reduced the momentum parameter.
→ Not an accident

slide-27
SLIDE 27

Negative Momentum for Improved Game Dynamics

Gidel, Askari Hemmat, Pezeshki, Huang, Lepriol, Lacoste-Julien, Mitliagkas

AISTATS 2019

slide-28
SLIDE 28

Our results

  • Negative momentum is optimal on a simple bilinear game
  • Negative momentum is empirically best for certain zero-sum games like “saturating GANs”
  • Negative momentum values are locally preferable near 0 on a more general class of games

slide-29
SLIDE 29

Momentum on games

Fixed point operator requires a state augmentation (because we need the previous iterate)

Recall Polyak’s momentum (on top of simultaneous grad. desc.):
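
The equation itself did not survive extraction. The standard heavy-ball form is written out below as a reconstruction, not the slide's exact notation. The symbols are assumptions: v is the game vector field, η the step size, β the momentum parameter, and the update uses the minimization-style sign convention.

```latex
% Heavy-ball (Polyak) momentum on top of simultaneous gradient descent
\theta_{t+1} = \theta_t - \eta\, v(\theta_t) + \beta\,(\theta_t - \theta_{t-1})

% Fixed-point operator on the augmented state (\theta_t, \theta_{t-1})
F_{\eta,\beta}(\theta_t, \theta_{t-1})
  = \bigl(\theta_t - \eta\, v(\theta_t) + \beta\,(\theta_t - \theta_{t-1}),\; \theta_t\bigr)

% Jacobian of the augmented operator
\nabla F_{\eta,\beta}(\theta_t, \theta_{t-1})
  = \begin{bmatrix}
      (1+\beta)\, I - \eta\, \nabla v(\theta_t) & -\beta\, I \\
      I & 0
    \end{bmatrix}
```

Setting β = 0 recovers the plain gradient descent operator in the top-left block, and the second block row only copies the current iterate into the "previous iterate" slot.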

slide-30
SLIDE 30

Bilinear game

slide-31
SLIDE 31

“Proof by picture”

Gradient descent
→ Simultaneous
→ Alternating

Momentum
→ Positive
→ Negative
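
Since the picture is missing from this transcript, the comparison can be approximated numerically. The sketch below assumes the scalar bilinear game L(x, y) = x·y with alternating updates and heavy-ball momentum, plus an arbitrarily chosen step size and momentum values of ±0.5; these specifics are illustrative and not read off the slide. Because the update is linear, its spectral radius on the augmented state decides between convergence, cycling and divergence.

```python
import numpy as np

def alternating_momentum_step(state, eta, beta):
    """One alternating heavy-ball step on L(x, y) = x * y.

    state = (x_t, y_t, x_{t-1}, y_{t-1}); x minimizes, y maximizes.
    """
    x, y, x_prev, y_prev = state
    x_new = x - eta * y + beta * (x - x_prev)        # dL/dx = y
    y_new = y + eta * x_new + beta * (y - y_prev)    # dL/dy = x, using the fresh x
    return np.array([x_new, y_new, x, y])

def spectral_radius(eta, beta):
    # The update is linear, so applying it to the basis vectors yields its matrix.
    M = np.column_stack([alternating_momentum_step(e, eta, beta) for e in np.eye(4)])
    return float(np.max(np.abs(np.linalg.eigvals(M))))

eta = 0.5                                            # assumed step size
rho_negative = spectral_radius(eta, beta=-0.5)
rho_zero = spectral_radius(eta, beta=0.0)
rho_positive = spectral_radius(eta, beta=0.5)

for name, rho in [("negative", rho_negative), ("zero", rho_zero), ("positive", rho_positive)]:
    print(f"{name:>8} momentum: spectral radius = {rho:.4f}")
```

Under these assumed values the negative-momentum radius falls below 1 (convergence), zero momentum sits at 1 (bounded cycling) and positive momentum exceeds 1 (divergence), matching the ordering the slide lists.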

slide-32
SLIDE 32

General games

slide-33
SLIDE 33

Eigen-analysis, 0 momentum

slide-34
SLIDE 34

Zero vs negative momentum

Momentum
→ Zero
→ Negative

slide-35
SLIDE 35

Negative Momentum

slide-36
SLIDE 36

Empirical results

slide-37
SLIDE 37

What happens in practice?

Fashion MNIST:

slide-38
SLIDE 38

What happens in practice?

CIFAR-10:

slide-39
SLIDE 39

Negative Momentum

To sum up:

  • Negative momentum seems to improve the behaviour due to “bad” eigenvalues.

  • Optimal for a class of games
  • Empirically optimal on “saturating” GANs