Generative Adversarial Networks, Phillip Isola, 9.520, 10/17/18



SLIDE 1

Generative Adversarial Networks

Phillip Isola 9.520 10/17/18

z ∼ N(0, I)

SLIDE 2

Image classification: image X → Classifier → label Y (“Fish”)

SLIDE 3

Image generation: label Y (“Fish”) → Generator → image X

SLIDE 4

What’s a generative model?

Useful for lots of problems beyond sampling random images! A model of high-dimensional unobserved variables: P(X | Y = y)

SLIDE 5

Gaussian noise z ∼ N(0, I) → Generative Model → Synthesized image x

SLIDE 6

Conditional Generative Model: “bird” (y) → Synthesized image x

SLIDE 7

Conditional Generative Model: “A yellow bird on a branch” (y) → Synthesized image x

SLIDE 8

Conditional Generative Model: y → Synthesized image x

SLIDE 9

Three perspectives on GANs

  • 1. Structured loss
  • 2. Generative model
  • 3. Domain-level supervision / mapping
SLIDE 10

Three perspectives on GANs

  • 1. Structured loss
  • 2. Generative model
  • 3. Domain-level supervision / mapping
SLIDE 11

Data prediction problems (“structured prediction”)

Object labeling

[Long et al. 2015, …]

Edge Detection

[Xie et al. 2015, …] [Reed et al. 2014, …]

Text-to-photo

“this small bird has a pink breast and crown…”

Future frame prediction

[Mathieu et al. 2016, …]

SLIDE 12

Challenges in data prediction

  • 1. Output is a high-dimensional, structured object
  • 2. Uncertainty in the mapping; many plausible outputs
SLIDE 13

Properties of generative models

  • 1. Model high-dimensional, structured output
  • 2. Model uncertainty; a whole distribution of possible outputs

SLIDE 14

Image-to-Image Translation

Objective function (loss) L, neural network F:

arg min_F E_{x,y}[ L(F(x), y) ]

Training data: pairs { (x, y) } of input x and output y
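The objective arg min_F E_{x,y}[L(F(x), y)] can be made concrete with a toy instance (my example: a one-parameter "network" F(x) = w·x and a hand-designed squared-error loss L):

```python
import numpy as np

# Toy instance of arg min_F E_{x,y}[ L(F(x), y) ]:
# F(x) = w * x, L = squared error, fit by gradient descent on the loss.
rng = np.random.default_rng(0)
x = rng.normal(size=1000)
y = 3.0 * x          # ground-truth mapping in the "training data": y = 3x

w = 0.0
lr = 0.1
for _ in range(100):
    grad = np.mean(2 * (w * x - y) * x)  # d/dw E[(w*x - y)^2]
    w -= lr * grad

print(round(w, 3))  # converges to 3.0
```

Everything in this recipe is chosen by hand, including L; the slides that follow ask what happens when L itself is a poor proxy for what we actually want.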

SLIDE 15

“What should I do?” “How should I do it?”

arg min_F E_{x,y}[ L(F(x), y) ]

Image-to-Image Translation: Input x → F → Output y

SLIDE 16

Input Output Ground truth

Designing loss functions

SLIDE 17

SLIDE 18

Color distribution cross-entropy loss with colorfulness enhancing term. Zhang et al. 2016

Designing loss functions

Input Ground truth

SLIDE 19

SLIDE 20

Designing loss functions Be careful what you wish for!

SLIDE 21

Designing loss functions

Image colorization: L2 regression [Zhang, Isola, Efros, ECCV 2016]
Super-resolution: L2 regression [Johnson, Alahi, Li, ECCV 2016]
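Why L2 regression disappoints here can be sketched numerically. A minimal toy (my example, not from the talk): when several outputs are plausible, the minimizer of the expected squared error is their mean, which may itself be an implausible, washed-out output.

```python
import numpy as np

# A pixel is either fully saturated (1.0) or fully desaturated (0.0), 50/50.
# The constant prediction minimizing E[(c - y)^2] is the MEAN of the modes,
# not either plausible mode: this is the "regression to the mean" behind
# desaturated colorizations and blurry super-resolution.
y = np.array([1.0, 0.0] * 500)

candidates = np.linspace(0.0, 1.0, 101)
losses = [np.mean((c - y) ** 2) for c in candidates]
best = candidates[int(np.argmin(losses))]

print(best)  # 0.5: the muddy average of the two plausible outputs
```

The same averaging happens per pixel in colorization and super-resolution, which is why these slides turn to learned losses instead.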

SLIDE 22

Designing loss functions

Image colorization: cross-entropy objective, with colorfulness term [Zhang, Isola, Efros, ECCV 2016]
Super-resolution: deep feature covariance matching objective [Johnson, Alahi, Li, ECCV 2016]

SLIDE 23

Universal loss?


SLIDE 24

“Generative Adversarial Network” (GANs)

A classifier distinguishes generated vs. real: real photos vs. generated images.

[Goodfellow, Pouget-Abadie, Mirza, Xu, Warde-Farley, Ozair, Courville, Bengio 2014]

SLIDE 25

Generator: x → G → G(x)

SLIDE 26

G tries to synthesize fake images that fool D. D tries to identify the fakes.

Generator, Discriminator: x → G → G(x) → D: real or fake?

SLIDE 27

arg max_D E_{x,y}[ log D(G(x)) + log(1 − D(y)) ]

x → G → G(x) → D: fake (0.9)
y → D: real (0.1)

SLIDE 28

G tries to synthesize fake images that fool D:

arg min_G E_{x,y}[ log D(G(x)) + log(1 − D(y)) ]

x → G → G(x) → D: real or fake?

SLIDE 29

G tries to synthesize fake images that fool the best D:

arg min_G max_D E_{x,y}[ log D(G(x)) + log(1 − D(y)) ]

x → G → G(x) → D: real or fake?
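In these slides' convention, D scores an image by its probability of being fake (fake ≈ 0.9, real ≈ 0.1), so the inner maximization rewards D for pushing D(G(x)) toward 1 and D(y) toward 0. A minimal numeric sketch (toy probabilities, not a trained network) checks that a confident discriminator attains a higher value of the inner objective than one that guesses:

```python
import math

# Slides' convention: D outputs the probability that an image is FAKE, so D
# maximizes E[ log D(G(x)) + log(1 - D(y)) ] over fakes G(x) and reals y.
def value(d_on_fake, d_on_real):
    return math.log(d_on_fake) + math.log(1 - d_on_real)

confident = value(0.99, 0.01)  # D nearly sure: fake -> 0.99, real -> 0.01
unsure = value(0.5, 0.5)       # D guessing at chance

assert confident > unsure      # a better discriminator scores higher
print(round(confident, 3), round(unsure, 3))
```

G then minimizes this same quantity, which it can only do by making G(x) hard for the best D to flag as fake.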

SLIDE 30

G’s perspective: D is a loss function. Rather than being hand-designed, it is learned.

x → G → G(x) → D (loss function)

SLIDE 31

arg min_G max_D E_{x,y}[ log D(G(x)) + log(1 − D(y)) ]

x → G → G(x) → D: real or fake?

SLIDE 32

arg min_G max_D E_{x,y}[ log D(G(x)) + log(1 − D(y)) ]

x → G → G(x) → D: real! (“Aquarius”)

SLIDE 33

arg min_G max_D E_{x,y}[ log D(G(x)) + log(1 − D(y)) ]

x → G → G(x) → D: real or fake pair?

SLIDE 34

arg min_G max_D E_{x,y}[ log D(x, G(x)) + log(1 − D(x, y)) ]

x → G → G(x) → D: real or fake pair?

SLIDE 35

arg min_G max_D E_{x,y}[ log D(x, G(x)) + log(1 − D(x, y)) ]

x → G → G(x) → D: fake pair

SLIDE 36

arg min_G max_D E_{x,y}[ log D(x, G(x)) + log(1 − D(x, y)) ]

x, y → D: real pair

SLIDE 37

arg min_G max_D E_{x,y}[ log D(x, G(x)) + log(1 − D(x, y)) ]

x → G → G(x) → D: real or fake pair?

SLIDE 38

Training Details: Loss function

Conditional GAN

SLIDE 39

Training Details: Loss function

Conditional GAN: stable training + fast convergence

[c.f. Pathak et al. CVPR 2016]
SLIDE 40

BW → Color

Input | Output

Data from [Russakovsky et al. 2015]

SLIDE 41

#edges2cats [Chris Hesse]

SLIDE 42

Ivy Tasi @ivymyt Vitaly Vidmirov @vvid

SLIDE 43

Structured Prediction

Input x → Output ŷ; Target y

L(ŷ, y) = ‖ŷ − y‖²

SLIDE 44

Structured Prediction

Each pixel treated as independent: ∏_i p(y_i | x)

CRF: models the pairwise configuration of pixels: (1/Z) ∏_{i,j} p(y_i, y_j | x)
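The limitation of the fully factorized model can be made concrete with a tiny numeric example (mine, not from the talk): a product of independent per-pixel distributions cannot represent correlations between output pixels.

```python
# True distribution over two binary "pixels": they are perfectly correlated,
# only (0,0) and (1,1) ever occur, each with probability 0.5.
true_p = {(0, 0): 0.5, (1, 1): 0.5, (0, 1): 0.0, (1, 0): 0.0}

# Best factorized fit prod_i p(y_i | x): each pixel is 1 with its marginal
# probability 0.5, independently of the other.
p1 = p2 = 0.5
indep_p = {(a, b): (p1 if a else 1 - p1) * (p2 if b else 1 - p2)
           for a in (0, 1) for b in (0, 1)}

# The independent model leaks half its mass onto impossible outputs.
print(indep_p[(0, 1)])  # 0.25, but the true probability is 0.0
```

Pairwise CRF terms recover some of this structure; modeling the full joint p(y | x), as on the next slide, recovers all of it.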

SLIDE 45

Structured Prediction

Model the joint configuration of all pixels: p(y | x)

A GAN with sufficient capacity, when perfectly optimized, samples from the full joint distribution. Most generative models have this property! Given sufficient capacity and infinite data, they are a complete solution to prediction problems.

slide-46
SLIDE 46

Shrinking the capacity: Patch Discriminator

Rather than penalizing if the output image looks fake, penalize if each overlapping patch in the output looks fake. D maps the N×N-pixel output y to a 1/0 decision per patch.

[Li & Wand 2016] [Shrivastava et al. 2017] [Isola et al. 2017]

SLIDE 47

Labels → Facades

Input 1x1 Discriminator

Data from [Tylecek, 2013]

SLIDE 48

Labels → Facades

Input 16x16 Discriminator

Data from [Tylecek, 2013]

SLIDE 49

Labels → Facades

Input 70x70 Discriminator

Data from [Tylecek, 2013]

SLIDE 50

Labels → Facades

Input Full image Discriminator

Data from [Tylecek, 2013]

SLIDE 51

Patch Discriminator

Rather than penalizing if the output image looks fake, penalize if each overlapping patch in the output looks fake.

  • Faster, fewer parameters
  • More supervised observations
  • Applies to arbitrarily large images
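The patch sizes quoted on the earlier slides (1×1, 16×16, 70×70) are the receptive field of one discriminator output unit. A short sketch of how that number falls out of a conv stack; the layer spec below is an assumption in the style of the pix2pix 70×70 PatchGAN (kernel 4; strides 2, 2, 2, 1, 1), not taken verbatim from these slides:

```python
# Receptive field of a stacked-conv patch discriminator: walk backwards from
# one output unit, expanding by each layer's kernel size and stride.
def receptive_field(layers):
    """layers: list of (kernel, stride) tuples, ordered input -> output."""
    rf = 1
    for k, s in reversed(layers):
        rf = (rf - 1) * s + k
    return rf

# Assumed spec: three stride-2 convs, then two stride-1 convs, all kernel 4.
layers = [(4, 2), (4, 2), (4, 2), (4, 1), (4, 1)]
print(receptive_field(layers))  # 70
```

Each output unit therefore judges one 70×70 patch, and the full-image decision is just the average of many such patch decisions, which is why the same discriminator applies to arbitrarily large images.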
SLIDE 52

Properties of generative models

  • 1. Model high-dimensional, structured output
  • 2. Model uncertainty; a whole distribution of possible outputs

→ Use a deep net, D, to model output!

SLIDE 53

Three perspectives on GANs

  • 1. Structured loss
  • 2. Generative model
  • 3. Domain-level supervision / mapping
SLIDE 54

Three perspectives on GANs

  • 1. Structured loss
  • 2. Generative model
  • 3. Domain-level supervision / mapping
SLIDE 55

Can we generate images from scratch?

Gaussian noise z ∼ N(0, I) → Synthesized image x

SLIDE 56

Generator: z → G → G(z)

[Goodfellow et al., 2014]

SLIDE 57

G tries to synthesize fake images that fool D. D tries to identify the fakes.

Generator, Discriminator: z → G → G(z) → D: real or fake?

[Goodfellow et al., 2014]

SLIDE 58

GANs are implicit generative models

Noise distribution: z ∼ N(0, 1)
Data distribution: x ∼ p(x)

A GAN is a “generative model” of the data x: samples from a perfectly optimized GAN are samples from the data distribution, G(z) ∼ p(x).
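"Implicit" means no density p(x) is ever written down; the model is just a map that pushes noise onto the data distribution. A toy closed-form sketch (my example; a trained GAN learns this map rather than having it specified):

```python
import numpy as np

# An implicit generative model: a deterministic map G pushes z ~ N(0, 1)
# forward onto the target distribution. Here the "data" distribution is
# N(5, 2^2), and G is the affine map whose pushforward matches it exactly.
rng = np.random.default_rng(0)

def G(z):
    return 5.0 + 2.0 * z

samples = G(rng.normal(size=100_000))
print(round(samples.mean(), 2), round(samples.std(), 2))  # ~5.0, ~2.0
```

We can evaluate and sample from G all day without ever being able to write down the density of G(z); that is exactly the situation with a trained generator.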

SLIDE 59

Progressive GAN [Karras et al., 2018]

SLIDE 60

Progressive GAN [Karras et al., 2018]

SLIDE 61

Proposition 1. For G fixed, the optimal discriminator D is

D*_G(x) = p_data(x) / (p_data(x) + p_g(x))

Proof.

V(G, D) = ∫_x p_data(x) log(D(x)) dx + ∫_z p_z(z) log(1 − D(g(z))) dz
        = ∫_x [ p_data(x) log(D(x)) + p_g(x) log(1 − D(x)) ] dx

For any (a, b) ∈ ℝ² \ {(0, 0)}, the function y ↦ a log(y) + b log(1 − y) achieves its maximum in [0, 1] at a/(a + b). The discriminator does not need to be defined outside Supp(p_data) ∪ Supp(p_g), concluding the proof.
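The pointwise maximization in the proof is easy to check numerically. A sketch with one arbitrary choice of a = p_data(x) and b = p_g(x):

```python
import numpy as np

# Check Proposition 1 pointwise: for a = p_data(x), b = p_g(x), the map
# y -> a*log(y) + b*log(1 - y) is maximized on (0, 1) at y = a / (a + b).
a, b = 0.3, 0.7
ys = np.linspace(1e-6, 1 - 1e-6, 200_001)
vals = a * np.log(ys) + b * np.log(1 - ys)
y_star = ys[np.argmax(vals)]

assert abs(y_star - a / (a + b)) < 1e-4
print(round(y_star, 4))  # ~0.3 = a / (a + b)
```

Setting the derivative a/y − b/(1 − y) to zero gives the same y = a/(a + b) analytically.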

SLIDE 62

C(G) = max_D V(G, D)
     = E_{x∼p_data}[log D*_G(x)] + E_{z∼p_z}[log(1 − D*_G(G(z)))]
     = E_{x∼p_data}[log D*_G(x)] + E_{x∼p_g}[log(1 − D*_G(x))]
     = E_{x∼p_data}[ log( p_data(x) / (p_data(x) + p_g(x)) ) ] + E_{x∼p_g}[ log( p_g(x) / (p_data(x) + p_g(x)) ) ]

Theorem 1. The global minimum of the virtual training criterion C(G) is achieved if and only if p_g = p_data.

Proof.

C(G) = −log(4) + KL( p_data ‖ (p_data + p_g)/2 ) + KL( p_g ‖ (p_data + p_g)/2 )

where KL is the Kullback–Leibler divergence. We recognize in the previous expression the Jensen–Shannon divergence between the model’s distribution and the data generating process:

C(G) = −log(4) + 2 · JSD( p_data ‖ p_g )

Since JSD ≥ 0, with equality iff p_g = p_data, p_g = p_data is the unique global minimizer of the GAN objective.
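The identity C(G) = −log(4) + 2·JSD(p_data ‖ p_g) can be verified directly on small discrete distributions (my toy distributions below):

```python
import numpy as np

# Check C(G) = -log(4) + 2 * JSD(p_data || p_g) on discrete distributions,
# and that C bottoms out at -log(4) exactly when p_g = p_data.
def C(pd, pg):
    m = pd + pg  # un-normalized mixture p_data + p_g, as in the derivation
    return np.sum(pd * np.log(pd / m)) + np.sum(pg * np.log(pg / m))

def jsd(pd, pg):
    m = (pd + pg) / 2
    kl = lambda p, q: np.sum(p * np.log(p / q))
    return 0.5 * kl(pd, m) + 0.5 * kl(pg, m)

pd = np.array([0.1, 0.2, 0.3, 0.4])
pg = np.array([0.4, 0.3, 0.2, 0.1])

assert abs(C(pd, pg) - (-np.log(4) + 2 * jsd(pd, pg))) < 1e-12
assert abs(C(pd, pd) - (-np.log(4))) < 1e-12  # minimum when p_g = p_data
print("identity holds")
```

So the perfectly-played minimax game drives the generator's distribution onto the data distribution, with optimal value −log 4.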

SLIDE 63

Behavior under model misspecification

[Theis et al. 2016]

SLIDE 64

Mode covering versus mode seeking

[Larsen et al. 2016]

SLIDE 65

arg max_D E_{z,x}[ log D(G(z)) + log(1 − D(x)) ]

z → G → G(z) → D: fake (0.9)
x → D: real (0.1)

[Goodfellow et al., 2014]

SLIDE 66

EBGAN, WGAN, LSGAN, etc.:

arg max_f E_{z,x}[ f(x) − f(G(z)) ]

f assigns a high score to real x and a low score to generated G(z).
SLIDE 67

SLIDE 68

Modeling multiple possible outputs

x → G → G(x)

SLIDE 69

Input → many possible outputs

Modeling multiple possible outputs

SLIDE 70

Modeling multiple possible outputs

x and z ∼ N(0, I) → G → G(x, z)

SLIDE 71

InfoGAN [Chen et al. 2016], BicycleGAN [Zhu et al., NIPS 2017]

x, z → G → y, with an encoder q(z | y): encourages z to relay information about the target.

SLIDE 72

Labels Randomly generated facades

[BicycleGAN, Zhu et al., NIPS 2017]

SLIDE 73

Properties of generative models

  • 1. Model high-dimensional, structured output
  • 2. Model uncertainty; a whole distribution of possible outputs

→ Use a deep net, D, to model output!
→ Generator is stochastic; learns to match the data distribution

SLIDE 74

Three perspectives on GANs

  • 1. Structured loss
  • 2. Generative model
  • 3. Domain-level supervision / mapping
SLIDE 75

Three perspectives on GANs

  • 1. Structured loss
  • 2. Generative model
  • 3. Domain-level supervision / mapping
SLIDE 76

Paired data

SLIDE 77

Unpaired data | Paired data

Jun-Yan Zhu Taesung Park

SLIDE 78

arg min_G max_D E_{x,y}[ log D(x, G(x)) + log(1 − D(x, y)) ]

x → G → G(x) → D: real or fake pair?

SLIDE 79

No input-output pairs!

arg min_G max_D E_{x,y}[ log D(x, G(x)) + log(1 − D(x, y)) ]

x → G → G(x) → D: real or fake pair?

SLIDE 80

arg min_G max_D E_{x,y}[ log D(G(x)) + log(1 − D(y)) ]

x → G → G(x) → D: real or fake?

Usual loss functions check whether the output matches a target instance. The GAN loss checks whether the output is part of an admissible set.

SLIDE 81

Gaussian z → Target distribution Y

SLIDE 82

Horses (X) ↔ Zebras (Y)

SLIDE 83

x → G → G(x) → D: Real!

SLIDE 84

x → G → G(x) → D: Real too!

Nothing forces the output to correspond to the input.

SLIDE 85

[Zhu et al. 2017], [Yi et al. 2017], [Kim et al. 2017]

Cycle-Consistent Adversarial Networks

SLIDE 86

Cycle-Consistent Adversarial Networks

SLIDE 87

Cycle Consistency Loss

SLIDE 88

Cycle Consistency Loss
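The cycle consistency loss can be sketched with toy mappings. A minimal example (my toy linear maps, standing in for the CycleGAN generators G: X → Y and F: Y → X), penalizing ‖F(G(x)) − x‖₁ + ‖G(F(y)) − y‖₁:

```python
import numpy as np

# Cycle consistency: G and F should invert each other on real samples.
G = lambda x: 2.0 * x + 1.0         # toy "horse -> zebra" map
F_good = lambda y: (y - 1.0) / 2.0  # exact inverse of G
F_bad = lambda y: y                 # ignores G entirely

def cycle_loss(G, F, xs, ys):
    return (np.mean(np.abs(F(G(xs)) - xs)) +
            np.mean(np.abs(G(F(ys)) - ys)))

xs = np.array([0.0, 1.0, 2.0])
ys = np.array([1.0, 3.0, 5.0])

assert cycle_loss(G, F_good, xs, ys) < 1e-12  # consistent cycle: zero loss
assert cycle_loss(G, F_bad, xs, ys) > 1.0     # inconsistent cycle penalized
```

This is the term that rules out the failure mode on the previous slides: the adversarial loss alone would accept any output in the target domain, but a mapping whose cycle does not return to the input pays a large cycle loss.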

SLIDE 89

SLIDE 90

SLIDE 91

Failure case

SLIDE 92

Failure case

SLIDE 93

Why does CycleGAN work?

SLIDE 94

Slide credit: Ming-Yu Liu

SLIDE 95

Slide credit: Ming-Yu Liu

SLIDE 96

Simplicity hypothesis

[Galanti, Wolf, Benaim, 2018]

SLIDE 97

Conditional Entropy

“ALICE: Towards Understanding Adversarial Learning for Joint Distribution Matching” [Li et al., NIPS 2017]. Also see [Tiao et al. 2018], “CycleGAN as Approximate Bayesian Inference”.

High Conditional Entropy Low Conditional Entropy

Cycle Loss upper bounds Conditional Entropy

SLIDE 98

Cycle Loss upper bounds Conditional Entropy

Conditional Entropy

“ALICE: Towards Understanding Adversarial Learning for Joint Distribution Matching” [Li et al., NIPS 2017]. Also see [Tiao et al. 2018], “CycleGAN as Approximate Bayesian Inference”.
SLIDE 99

[Tzeng et al. 2014]

Domain Adaptation

SLIDE 100

Sim2real

Simulated data → Real data?

[Richter*, Vineet* et al. 2016] [Krähenbühl et al. 2018]

SLIDE 101

CycleGAN


Training data

[Hoffman, Tzeng, Park, Zhu, Isola, Saenko, Darrell, Efros, 2018]

SLIDE 102


Training data

CycleGAN

[Hoffman, Tzeng, Park, Zhu, Isola, Saenko, Darrell, Efros, 2018]

SLIDE 103


Training data

CycleGAN FCN

[Hoffman, Tzeng, Park, Zhu, Isola, Saenko, Darrell, Efros, 2018]

SLIDE 104
Medical domain adaptation

  • MRI reconstruction [Quan et al.] arXiv:1709.00753
  • Cardiac MR images from CT [Chartsias et al. 2017]

Input MR | Generated CT | Ground truth CT

SLIDE 105

Three perspectives on GANs

  • 1. Structured loss
  • 2. Generative model
  • 3. Domain-level supervision / mapping
SLIDE 106

Thank you!