Conditional Generative Adversarial Networks (and a brief look at image-to-image translation) - PowerPoint PPT Presentation



SLIDE 1

Conditional Generative Adversarial Networks

(and a brief look at image-to-image translation)

Peter Bromley

Final Presentation

SLIDE 2

Generative Models

What is generative modeling?
- Data: samples from a high-dimensional probability distribution p_real
- Model: approximate p_real with a learned distribution p_fake

https://blog.openai.com/generative-models/

SLIDE 3

Generative Models - cont.

https://blog.openai.com/generative-models/
https://www.statlect.com/probability-distributions/normal-distribution

[Diagram: noise is mapped through a function with learned weights to produce p_fake, which is compared to p_real via a loss function]

Why do it?
- Data augmentation
- Similar to the human ability of imagining an image
- Can map from a noise vector to a high-dimensional probability distribution
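As a toy illustration of this noise-to-sample mapping, here is a minimal "generator" sketched in NumPy. The weights are random stand-ins for learned ones, and the dimensions (100-d noise, 784-d output) are illustrative assumptions, not from the slides:

```python
import numpy as np

rng = np.random.default_rng(0)

def generator(z, W, b):
    # Map latent noise through a function with (normally learned) weights.
    return np.tanh(z @ W + b)  # squash outputs into [-1, 1], like normalized pixels

latent_dim, data_dim = 100, 784               # e.g. flattened 28x28 images
W = 0.01 * rng.normal(size=(latent_dim, data_dim))
b = np.zeros(data_dim)

z = rng.normal(size=(16, latent_dim))         # z ~ N(0, 1)
samples = generator(z, W, b)                  # 16 draws from p_fake
print(samples.shape)                          # (16, 784)
```

A real generator differs only in that W and b are trained so that p_fake approaches p_real.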

SLIDE 4

Generative Adversarial Networks

z ~ 𝒩(0, 1)

Generator

Sample from p_fake | Sample from p_real

Discriminator

p_real or p_fake?

Discriminator targets: 1 for real samples, 0 for fake. GAN loss: min_G max_D V(D, G) = E_{x ~ p_real}[log D(x)] + E_{z ~ 𝒩(0, 1)}[log(1 − D(G(z)))]

Nash Equilibrium: Discriminator guesses completely randomly (accuracy = 0.5)

Goodfellow et al., https://arxiv.org/abs/1406.2661
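The value function above can be checked numerically. A small NumPy sketch (with hypothetical discriminator outputs) showing that at the Nash equilibrium, where D outputs 0.5 everywhere, V(D, G) sits at −log 4:

```python
import numpy as np

def gan_value(d_real, d_fake):
    # V(D, G) = E[log D(x)] + E[log(1 - D(G(z)))]
    return np.mean(np.log(d_real)) + np.mean(np.log(1.0 - d_fake))

# A confident discriminator: D(x) near 1 on real, D(G(z)) near 0 on fake.
confident = gan_value(np.full(8, 0.99), np.full(8, 0.01))

# At the equilibrium the discriminator guesses randomly: D = 0.5 everywhere.
equilibrium = gan_value(np.full(8, 0.5), np.full(8, 0.5))

print(round(equilibrium, 3))   # -1.386, i.e. -log 4
```

The generator pushes V down toward −log 4 while the discriminator pushes it up, which is why training settles where accuracy = 0.5.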

SLIDE 5

DCGAN (Deep Convolutional GAN) - Model

http://gluon.mxnet.io/chapter14_generative-adversarial-networks/dcgan.html

[Diagram — Generator: z ~ 𝒩(0, 1) → Fully Connected → Conv → Conv → Conv → Conv → sample from p_fake]

[Diagram — Discriminator: sample from p_fake or p_real → Fake or Real?]

https://www.safaribooksonline.com/library/view/deep-learning-with/9781787128422/abc1dd74-9e57-4f89-82a5-3014fc35b664.xhtml

SLIDE 6

Conditional GANs

Guide the learning process by conditioning the network on some label y. New loss function: min_G max_D E_{x ~ p_real}[log D(x | y)] + E_{z}[log(1 − D(G(z | y) | y))]

https://arxiv.org/abs/1610.09585
https://arxiv.org/pdf/1411.1784.pdf

*(for cGAN model specifically)


2N-GAN

c = 1 (real, class 1), c = 2 (real, class 2), …, c = n + 2 (fake, class 2)

"n" is the number of classes: real samples of class i get label i, fake samples of class i get label n + i

As of 5/10/18, this model has not been thoroughly explored in the literature

https://arxiv.org/pdf/1606.03657.pdf
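The 2N labeling scheme can be sketched as a small helper (hypothetical code, not from the slides — it just makes the mapping explicit):

```python
def two_n_label(class_idx, is_real, n):
    """Map (class, real/fake) to a single 2N-GAN discriminator label.

    Real samples of class c get label c; fake samples get label c + n,
    so the discriminator solves one 2n-way classification problem.
    """
    assert 1 <= class_idx <= n
    return class_idx if is_real else class_idx + n

n = 10  # e.g. MNIST / Fashion MNIST / CIFAR10
print(two_n_label(1, True, n))    # real, class 1 -> 1
print(two_n_label(2, False, n))   # fake, class 2 -> 12, i.e. n + 2
```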

SLIDE 7

Project Goals

Compare GAN models (specifically "conditional" GANs) on toy datasets:
- Conceptually (pros and cons)
- Subjectively (my evaluation of the images)
- Empirically (quantitative metric)
Briefly look into image-to-image translation:
- Apply it to a novel image domain
- Experiment with the "2N label" model

SLIDE 8

DCGAN (Not a conditional model) - Results

Target images (p_real) | Generated images (p_fake)

DCGAN makes realistic-looking images, but has no notion of class type

SLIDE 9

Conditional Models - MNIST / Fashion MNIST

cDCGAN ACGAN 2NGAN

SLIDE 10

Conditional Models - MNIST / Fashion MNIST (Loss Plots)

cDCGAN ACGAN 2NGAN

[Loss plots for MNIST and for Fashion MNIST]

SLIDE 11

Conditional Models - MNIST (In-Class Variation)

cDCGAN ACGAN 2NGAN Real

ACGAN and 2NGAN seem to underfit the "ones" distribution, but not the "sevens". cDCGAN captures more variety, but produces more "wrong" numbers.

Note: “Ones” variant seems to be far less common than the “sevens” variant in the real data

SLIDE 12

Real CIFAR10 Data (for reference)

plane car bird cat deer dog frog horse ship truck

Real

SLIDE 13

Conditional Models - CIFAR10 (Results)

ACGAN cDCGAN 2NGAN

plane car bird cat deer dog frog horse ship truck

SLIDE 14

Conditional Models - CIFAR10 (Loss and Inception Score)

cDCGAN ACGAN 2NGAN

Inception Score:
- Use a pretrained Inception net to classify generated samples
- Low entropy for the predicted labels of individual samples (p(y|x))
- High entropy for the whole generated distribution (p(y))
- Higher is better

Inception Scores (in the column order above):
- cDCGAN: mean 5.34, std 0.12
- ACGAN: mean 3.55, std 0.07
- 2NGAN: mean 6.36, std 0.17
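The score described above is exp(E_x[KL(p(y|x) || p(y))]), which can be sketched in NumPy (a toy classifier-output matrix stands in for real Inception-net predictions):

```python
import numpy as np

def inception_score(probs):
    # IS = exp( E_x[ KL( p(y|x) || p(y) ) ] ) for classifier outputs probs (N, K).
    p_y = probs.mean(axis=0)                                   # marginal label dist
    kl = np.sum(probs * (np.log(probs) - np.log(p_y)), axis=1)
    return float(np.exp(kl.mean()))

n_classes = 10
uniform = np.full((n_classes, n_classes), 1 / n_classes)       # unsure classifier
sharp = np.full((n_classes, n_classes), 0.01)                  # confident AND diverse:
np.fill_diagonal(sharp, 0.91)                                  # rows still sum to 1

print(inception_score(uniform))   # 1.0 — worst case
print(inception_score(sharp))     # > 1, approaching n_classes
```

Confident per-sample predictions (low entropy p(y|x)) combined with a varied overall output (high entropy p(y)) maximize the KL term, which is why higher is better.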

SLIDE 15

Conditional Models - InfoGAN

InfoGAN - Information Maximizing GAN for Unsupervised Learning

Input a predictable latent code as well as noise
Maximize mutual information between the code and the output
Input: z ~ 𝒩(0, 1), c_cat ~ Categorical(10 classes), c1 and c2 ~ Unif(−1, 1)

[Figure: samples varying the categorical code c_cat and the continuous codes c1, c2]
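Assembling the InfoGAN input above can be sketched as follows (the 62-dimensional noise vector is an assumption for the MNIST setting; the slides only specify the code distributions):

```python
import numpy as np

rng = np.random.default_rng(0)

def sample_infogan_latent(batch, noise_dim=62, n_cat=10, n_cont=2):
    # Concatenate noise z with the structured codes whose mutual
    # information with the output InfoGAN maximizes.
    z = rng.normal(size=(batch, noise_dim))              # z ~ N(0, 1)
    cat = rng.integers(n_cat, size=batch)                # c_cat ~ Categorical(10)
    c_cat = np.eye(n_cat)[cat]                           # one-hot encode the class
    c_cont = rng.uniform(-1, 1, size=(batch, n_cont))    # c1, c2 ~ Unif(-1, 1)
    return np.concatenate([z, c_cat, c_cont], axis=1)

latent = sample_infogan_latent(16)
print(latent.shape)   # (16, 74): 62 noise + 10 one-hot + 2 continuous
```

Varying c_cat (or sweeping c1, c2) while holding z fixed is what produces the structured variation in the figures.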

SLIDE 16

Conditional Models - More InfoGAN

[Figure: samples varying the continuous codes c1 and c2]

SLIDE 17

Higher Resolution Dataset: Cat Faces

DCGAN InfoGAN

https://github.com/AlexiaJM/Deep-learning-with-cats

SLIDE 18

Linear Interpolation - Cats

SLIDE 19

Conditional Model Comparisons - Conclusions

2NGAN: Most stable training, highest Inception Score cDCGAN: Unstable training, but high in-class variability ACGAN: Stable training, but lowest Inception Score InfoGAN: Picks up interesting features in an unsupervised manner

SLIDE 20

A brief look into Image-to-Image Translation with CycleGAN

Image-to-Image Translation: translate an image from domain X to domain Y
Examples:
- Black-and-white photos to color
- Summer landscape images to winter

CycleGAN:

https://hardikbansal.github.io/CycleGANBlog/

Cycle-Consistent Loss

Model works with unpaired training data

https://junyanz.github.io/CycleGAN/
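The cycle-consistent loss can be sketched in NumPy, with toy invertible maps standing in for CycleGAN's two generator networks G: X→Y and F: Y→X (a sketch, not the paper's implementation):

```python
import numpy as np

def cycle_consistency_loss(x, y, G, F):
    # L_cyc = E[|F(G(x)) - x|] + E[|G(F(y)) - y|]  (L1 reconstruction both ways)
    return np.mean(np.abs(F(G(x)) - x)) + np.mean(np.abs(G(F(y)) - y))

# Toy "translators": simple shifts stand in for the real generator networks.
G = lambda a: a + 0.5   # X -> Y
F = lambda b: b - 0.5   # Y -> X, the exact inverse of G

x = np.zeros((4, 8))    # batch from domain X
y = np.ones((4, 8))     # batch from domain Y
print(cycle_consistency_loss(x, y, G, F))   # 0.0 — perfectly cycle-consistent
```

Because the loss only compares each image with its own round-trip reconstruction, no paired (x, y) examples are needed, which is what lets the model train on unpaired data.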

SLIDE 21

CycleGAN Results: Monet2Photo

Real photo (Real A) → fake Monet (Fake B); real Monet (Real B) → fake photo (Fake A)

SLIDE 22

CycleGAN Results: BrownBear2Panda (My experiment)

SLIDE 23

Overall Conclusions, Open Problems, and Future Work

- GANs are very hard to evaluate empirically
- It is difficult to pick up global coherence in datasets with lots of image variety
- Throwing labels at a model does not always make it better

In the future:
- Further research into the 2NGAN model
- Experiment with more variables in InfoGAN
- Go more in depth into image-to-image translation
- Text-to-image synthesis

SLIDE 24

Miscellaneous Failures

Mode collapse at the 70th epoch

"Animal Faces"

“Cartoon2Celebrity”

Demonic dogs

SLIDE 25

Thanks for watching!