Conditional Generative Adversarial Networks (and a brief look at image-to-image translation)
Final Presentation: Peter Bromley
Generative Models
What is generative modeling?
Data: samples from a high-dimensional probability distribution p_real
Model: approximate p_real with a learned distribution p_fake
https://blog.openai.com/generative-models/
Generative Models - cont.
https://blog.openai.com/generative-models/ https://www.statlect.com/probability-distributions/normal-distribution
[Diagram: a sample is mapped through a function with learned weights to produce p_fake, which is compared to p_real via a loss function]
Why do it?
Data augmentation
Similar to the human ability to imagine an image
Can map from a noise vector to a high-dimensional probability distribution
Generative Adversarial Networks
[Diagram: z ~ 𝒩(0, 1) → Generator → sample from p_fake; a sample from p_real and a sample from p_fake are both fed to the Discriminator, which predicts p_real or p_fake]
GAN loss (discriminator targets: 1 for real, 0 for fake):
min_G max_D V(D, G) = E_{x ~ p_real}[log D(x)] + E_{z ~ 𝒩(0, 1)}[log(1 - D(G(z)))]
Nash equilibrium: the discriminator guesses completely at random (accuracy = 0.5)
Goodfellow et al., https://arxiv.org/abs/1406.2661
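The minimax objective can be checked numerically. Below is a minimal NumPy sketch (illustrative only, not the code behind these slides); `d_loss` and `g_loss` are hypothetical helper names:

```python
import numpy as np

def d_loss(d_real, d_fake):
    # Discriminator maximizes E[log D(x)] + E[log(1 - D(G(z)))];
    # we return the negative, so minimizing it is equivalent.
    return float(-(np.log(d_real) + np.log(1.0 - d_fake)).mean())

def g_loss(d_fake):
    # Non-saturating generator loss: minimize -E[log D(G(z))]
    return float(-np.log(d_fake).mean())

# At the Nash equilibrium the discriminator outputs 0.5 everywhere:
d_real = np.full(4, 0.5)
d_fake = np.full(4, 0.5)
print(round(d_loss(d_real, d_fake), 4))  # 1.3863, i.e. 2*log(2)
```

At the equilibrium described above (accuracy = 0.5), the discriminator loss settles at 2·log(2) ≈ 1.386, which is a common sanity check when reading GAN loss plots.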
DCGAN (Deep Convolutional GAN) - Model
http://gluon.mxnet.io/chapter14_generative-adversarial-networks/dcgan.html
[Diagram: z ~ 𝒩(0, 1) → fully connected layer → stack of convolutional layers → sample from p_fake; samples from p_fake and p_real → convolutional discriminator → fake or real?]
https://www.safaribooksonline.com/library/view/deep-learning-with/9781787128422/abc1dd74-9e57-4f89-82a5-3014fc35b664.xhtml
Conditional GANs
Guide the learning process by conditioning the network on some label y.
New loss function:
min_G max_D V(D, G) = E_{x ~ p_real}[log D(x | y)] + E_{z}[log(1 - D(G(z | y)))]
https://arxiv.org/abs/1610.09585 https://arxiv.org/pdf/1411.1784.pdf
*(for cGAN model specifically)
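In practice the conditioning is often implemented by concatenating a one-hot encoding of y onto the generator's noise input (and onto the discriminator's input). A minimal NumPy sketch; the sizes (z_dim = 100, 10 classes) are illustrative assumptions:

```python
import numpy as np

def one_hot(y, num_classes):
    # Turn integer class labels into one-hot row vectors
    out = np.zeros((len(y), num_classes))
    out[np.arange(len(y)), y] = 1.0
    return out

batch, z_dim, num_classes = 8, 100, 10
z = np.random.randn(batch, z_dim)             # noise z ~ N(0, 1)
y = np.random.randint(0, num_classes, batch)  # class labels

# Generator input: noise concatenated with the one-hot label
g_input = np.concatenate([z, one_hot(y, num_classes)], axis=1)
print(g_input.shape)  # (8, 110)
```

For image discriminators, the label is usually broadcast to a spatial map and concatenated along the channel axis instead; the vector form shown here matches the fully connected case.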
2N-GAN
c = 1 (real, class 1)
c = 2 (real, class 2)
…
c = n + 2 (fake, class 2)
where n is the number of classes, so the discriminator predicts one of 2n labels (n real classes plus n fake classes)
As of 5/10/18, this model has not been thoroughly explored in the literature
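The 2N labeling scheme above can be sketched as a small mapping; `two_n_label` is a hypothetical helper name, not code from the presentation:

```python
def two_n_label(class_idx, is_real, n):
    """Map (class, real/fake) to one of 2n discriminator labels.

    Real samples of class k get label k; fake samples of class k
    get label n + k (1-indexed, matching the slide's c = n + 2
    for a fake sample of class 2).
    """
    return class_idx if is_real else n + class_idx

n = 10  # e.g. MNIST
print(two_n_label(1, True, n))   # 1  -> real, class 1
print(two_n_label(2, False, n))  # 12 -> fake, class 2 (n + 2)
```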
https://arxiv.org/pdf/1606.03657.pdf
Project Goals
Compare GAN models (specifically conditional GANs) on toy datasets:
Conceptually (pros and cons)
Subjectively (my evaluation of the images)
Empirically (quantitative metric)
Briefly look into image-to-image translation:
Apply to a novel image domain
Experiment with the “2N label” model
DCGAN (Not a conditional model) - Results
Target Images (p_real) | Generated Images (p_fake)
DCGAN makes realistic looking images, but has no notion of class type
Conditional Models - MNIST / Fashion MNIST
cDCGAN ACGAN 2NGAN
Conditional Models - MNIST / Fashion MNIST (Loss Plots)
cDCGAN ACGAN 2NGAN
MNIST: Fashion:
Conditional Models - MNIST (In-Class Variation)
cDCGAN ACGAN 2NGAN Real
ACGAN and 2NGAN seem to underfit the “ones” distribution, but not the “sevens.”
cDCGAN captures more variety, but produces more “wrong” numbers.
Note: “Ones” variant seems to be far less common than the “sevens” variant in the real data
Real CIFAR10 Data (for reference)
plane car bird cat deer dog frog horse ship truck
Real
Conditional Models - CIFAR10 (Results)
ACGAN cDCGAN 2NGAN
plane car bird cat deer dog frog horse ship truck
Conditional Models - CIFAR10 (Loss and Inception Score)
cDCGAN ACGAN 2NGAN
Inception Score: use a pretrained Inception network to classify generated samples
Low entropy for the per-sample class predictions p(y|x)
High entropy for the marginal class distribution of the whole generated set
Higher is better
Inception Scores:
cDCGAN: mean 5.3405895, std 0.12261291
ACGAN: mean 3.5518444, std 0.068273105
2NGAN: mean 6.355787, std 0.1747299
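The score itself is exp of the mean KL divergence between p(y|x) and the marginal p(y). A NumPy sketch, assuming the Inception network's softmax outputs are already given as a matrix:

```python
import numpy as np

def inception_score(probs, eps=1e-12):
    """probs: (num_samples, num_classes) softmax outputs p(y|x)."""
    p_y = probs.mean(axis=0)  # marginal distribution p(y)
    # KL(p(y|x) || p(y)) per sample, then exponentiate the mean
    kl = (probs * (np.log(probs + eps) - np.log(p_y + eps))).sum(axis=1)
    return float(np.exp(kl.mean()))

# Confident AND diverse predictions -> best case (= number of classes)
confident = np.eye(10)
print(round(inception_score(confident), 3))  # 10.0

# Uniform predictions -> worst case
uniform = np.full((10, 10), 0.1)
print(round(inception_score(uniform), 3))    # 1.0
```

This makes the “higher is better” point concrete: the score ranges from 1 (uninformative samples) up to the number of classes (sharp, evenly spread predictions).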
Conditional Models - InfoGAN
InfoGAN - Information Maximizing GAN for Unsupervised Learning
Input a predictable latent code as well as noise; maximize the mutual information between the code and the output.
Inputs: z ~ 𝒩(0, 1), c_cat ~ Categorical (10 classes), c1 and c2 ~ Unif(-1, 1)
[Figure: generated samples varying c_cat across rows and the continuous codes c1, c2 across columns]
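Sampling the InfoGAN latent input described above can be sketched as follows; the noise dimension (62) matches the InfoGAN paper's MNIST setup, but treat the exact sizes as assumptions:

```python
import numpy as np

rng = np.random.default_rng(0)

def sample_infogan_latents(batch, z_dim=62, n_cat=10):
    z = rng.standard_normal((batch, z_dim))   # z ~ N(0, 1)
    cat = rng.integers(0, n_cat, batch)       # categorical code draw
    c_cat = np.zeros((batch, n_cat))
    c_cat[np.arange(batch), cat] = 1.0        # one-hot encode c_cat
    c1 = rng.uniform(-1, 1, (batch, 1))       # continuous codes
    c2 = rng.uniform(-1, 1, (batch, 1))
    return np.concatenate([z, c_cat, c1, c2], axis=1)

latents = sample_infogan_latents(4)
print(latents.shape)  # (4, 74): 62 noise + 10 one-hot + 2 continuous
```

The auxiliary Q-network (not shown) is then trained to recover c_cat, c1, and c2 from the generated image, which is what maximizes the mutual information term.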
Conditional Models - More InfoGAN
[Figure: generated samples varying the continuous codes c1 and c2]
Higher Resolution Dataset: Cat Faces
DCGAN InfoGAN
https://github.com/AlexiaJM/Deep-learning-with-cats
Linear Interpolation - Cats
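Latent-space interpolation is a straight line between two noise vectors; each intermediate point is fed to the generator to produce a smooth morph between two cat faces. A minimal sketch:

```python
import numpy as np

def lerp(z0, z1, steps):
    # Linear interpolation between two latent vectors; feeding each
    # intermediate z to the generator yields the interpolated images.
    alphas = np.linspace(0.0, 1.0, steps)[:, None]
    return (1 - alphas) * z0 + alphas * z1

z0 = np.zeros(100)  # two endpoint latents (illustrative)
z1 = np.ones(100)
path = lerp(z0, z1, 5)
print(path.shape)    # (5, 100)
print(path[2][0])    # 0.5 (the midpoint)
```

Smooth, plausible images along the path are often taken as evidence that the generator has learned a well-structured latent space rather than memorized samples.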
Conditional Model Comparisons - Conclusions
2NGAN: most stable training, highest Inception Score
cDCGAN: unstable training, but high in-class variability
ACGAN: stable training, but lowest Inception Score
InfoGAN: picks up interesting features in an unsupervised manner
A brief look into Image-to-Image Translation with CycleGAN
Image-to-Image Translation: translate an image from domain X to domain Y
Examples:
Black-and-white photos to color
Summer landscape images to winter
CycleGAN:
https://hardikbansal.github.io/CycleGANBlog/
Cycle-Consistent Loss
Model works with unpaired training data
https://junyanz.github.io/CycleGAN/
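The cycle-consistent loss is what lets CycleGAN train on unpaired data: with generators G: X→Y and F: Y→X, a sample translated and then translated back should return to itself, penalized by an L1 distance. A toy NumPy sketch of just the loss term:

```python
import numpy as np

def cycle_consistency_loss(x, x_reconstructed):
    # L1 distance between an image and its round trip F(G(x));
    # the full CycleGAN loss adds the symmetric term for G(F(y)).
    return float(np.abs(x - x_reconstructed).mean())

x = np.ones((1, 3, 8, 8))            # toy "image" tensor
x_rec = np.full((1, 3, 8, 8), 0.9)   # imperfect round trip
print(round(cycle_consistency_loss(x, x_rec), 4))  # 0.1
```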
CycleGAN Results: Monet2Photo
Real Photo (Real A) → Fake Monet (Fake B); Real Monet (Real B) → Fake Photo (Fake A)
CycleGAN Results: BrownBear2Panda (My experiment)
Overall Conclusions, Open Problems, and Future Work
GANs are very hard to evaluate empirically.
It is difficult to capture global coherence in datasets with lots of image variety.
Throwing labels at a model does not always make it better.
In the future:
Further research into the 2NGAN model
Experiment with more variables in InfoGAN
Go more in depth into image-to-image translation
Text-to-image synthesis
Miscellaneous Failures
Mode collapse at the 70th epoch (“Animal Faces”)
“Cartoon2Celebrity”
Demonic dogs