Unsupervised Learning


1. Unsupervised Learning
• There is no direct ground truth for the quantity of interest
• Autoencoders
• Variational Autoencoders (VAEs)
• Generative Adversarial Networks (GANs)

2. Autoencoders
Goal: meaningful features that capture the main factors of variation in the dataset
• These are good for classification, clustering, exploration, generation, …
• We have no ground truth for the features
[Diagram: input data → encoder → features]
Slide Credit: Fei-Fei Li, Justin Johnson, Serena Yeung, CS 231n

3. Autoencoders
Goal: meaningful features that capture the main factors of variation, i.e. features that can be used to reconstruct the input
• L2 loss function: ‖x − x̂‖²
[Diagram: input data → encoder → features (latent variables) → decoder → reconstruction]
Slide Credit: Fei-Fei Li, Justin Johnson, Serena Yeung, CS 231n

4. Autoencoders
• A linear encoder and decoder give results close to PCA
• Deeper networks give better reconstructions, since the basis can be non-linear
[Figure: original images vs. autoencoder vs. PCA reconstructions]
Image Credit: Reducing the Dimensionality of Data with Neural Networks, Hinton and Salakhutdinov
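The linear-autoencoder/PCA connection above can be checked numerically. The sketch below (illustrative, not from the slides) uses the fact that a rank-k truncated SVD is the optimal rank-k linear encode/decode in the L2 sense (Eckart-Young), which is what a linear autoencoder converges to, and compares it against an arbitrary random linear code of the same size:

```python
import numpy as np

rng = np.random.default_rng(0)
X = rng.normal(size=(200, 10))
X = X - X.mean(axis=0)           # center the data, as PCA assumes

k = 3                            # size of the latent code
U, S, Vt = np.linalg.svd(X, full_matrices=False)
W = Vt[:k]                       # top-k principal directions = optimal linear "encoder"

Z = X @ W.T                      # encode: 10-D data -> 3-D features
X_pca = Z @ W                    # decode: 3-D features -> 10-D reconstruction
err_pca = np.mean((X - X_pca) ** 2)

# Any other rank-k linear encoder/decoder cannot beat this (Eckart-Young):
W_rand = rng.normal(size=(k, 10))
W_rand /= np.linalg.norm(W_rand, axis=1, keepdims=True)
X_rand = X @ W_rand.T @ W_rand
err_rand = np.mean((X - W_rand.T.T @ X_rand.T).T ** 2) if False else np.mean((X - X_rand) ** 2)

print("PCA reconstruction error:   ", err_pca)
print("random-code reconstruction: ", err_rand)
```

A deeper, non-linear autoencoder can do better than `err_pca` because its decoder is not restricted to a linear basis, which is exactly the point the slide makes.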

5. Example: Document Word Probabilities → 2D Code
[Figure: 2D codes produced by PCA vs. by an autoencoder]
Image Credit: Reducing the Dimensionality of Data with Neural Networks, Hinton and Salakhutdinov

6. Example: Semi-Supervised Classification
• Many images, but few ground truth labels
• Start unsupervised: train an autoencoder on the many unlabeled images (L2 reconstruction loss on encoder features and latent variables)
• Supervised fine-tuning: reuse the trained encoder, add a classifier on top of its features, and train the classification network on the labeled images with a classification loss (softmax, etc.) against the ground truth labels
Slide Credit: Fei-Fei Li, Justin Johnson, Serena Yeung, CS 231n

7. Autoencoder: geometry.cs.ucl.ac.uk/creativeai

8. Generative Models
• Assumption: the dataset {x₁, …, xₙ} consists of samples from an unknown distribution p(x)
• Goal: create a new sample from p(x) that is not in the dataset
[Figure: dataset images …, and a generated face not in the dataset]
Image credit: Progressive Growing of GANs for Improved Quality, Stability, and Variation, Karras et al.


10. Generative Models
• A generator G with parameters θ maps latent samples z to generated samples x = G(z)
• The latent distribution p(z) is known and easy to sample from

11. Generative Models
How to measure the similarity of the data distribution p(x) and the generated distribution q(x)?
1) Likelihood of the data in q(x): Variational Autoencoders (VAEs)
2) Adversarial game: a discriminator tries to distinguish real samples from generated ones, while the generator tries to make them hard to distinguish: Generative Adversarial Networks (GANs)

12. Autoencoders as Generative Models?
• A trained decoder transforms some features z into approximate samples from p(x)
• What happens if we pick a random z? Decoder = generator?
• We do not know the distribution of features z that decode to likely samples
[Figure: a random point in the feature space / latent space and its decoded output]
Image Credit: Reducing the Dimensionality of Data with Neural Networks, Hinton and Salakhutdinov

13. Variational Autoencoders (VAEs)
• Pick a parametric distribution p(z) for the features
• The generator with parameters θ maps z to an image distribution p_θ(x|z)
• Train the generator to maximize the likelihood of the data samples in p_θ(x)
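The likelihood formulas dropped from this slide can be reconstructed in standard VAE notation (a sketch using the usual symbols: θ for the generator parameters, z for the latent features):

```latex
p_\theta(x) = \int p_\theta(x \mid z)\, p(z)\, dz,
\qquad
\theta^\ast = \arg\max_\theta \sum_i \log p_\theta(x_i)
```

The integral marginalizes the latent variable out of the generator's conditional distribution; the training objective is the log-likelihood of the dataset under that marginal.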

14. Outputting a Distribution
• The generator with parameters θ does not output a single image; it outputs the parameters of a distribution p_θ(x|z), e.g. a Bernoulli distribution or a Normal distribution, from which a sample can then be drawn


16. Variational Autoencoders (VAEs): Naïve Sampling (Monte-Carlo)
• Maximum likelihood of the data in the generated distribution: max_θ Σ_i log p_θ(x_i)
• Approximate the integral p_θ(x) = ∫ p_θ(x|z) p(z) dz with Monte-Carlo samples of z in each iteration
• SGD approximates the sum over the data

17. Variational Autoencoders (VAEs): Naïve Sampling (Monte-Carlo)
• Approximate the integral with Monte-Carlo samples z_j ~ p(z) in each iteration
• SGD approximates the expectation over the data
• Loss function: −log( (1/k) Σ_j p_θ(x|z_j) ), for a random x from the dataset

18. Variational Autoencoders (VAEs): Naïve Sampling (Monte-Carlo)
• Only a few z map close to a given x, i.e. have non-zero p_θ(x|z)
• The estimate is very expensive or very inaccurate, depending on the sample count
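The cost/accuracy problem above is easy to demonstrate in one dimension. The sketch below (illustrative, not from the slides) replaces the learned generator with a fixed toy map g(z) = 2z + 1 and estimates p(x) = ∫ p(x|z) p(z) dz by plain Monte-Carlo, comparing few-sample and many-sample estimates:

```python
import numpy as np

rng = np.random.default_rng(0)
SIGMA = 0.1  # noise scale of p(x|z) = N(g(z), SIGMA^2)

def p_x_given_z(x, z):
    g = 2.0 * z + 1.0  # stand-in "generator" g(z) = 2z + 1 (not a trained network)
    return np.exp(-0.5 * ((x - g) / SIGMA) ** 2) / (SIGMA * np.sqrt(2.0 * np.pi))

def mc_likelihood(x, k):
    # p(x) = ∫ p(x|z) p(z) dz  ≈  (1/k) Σ_j p(x|z_j),  z_j ~ p(z) = N(0, 1)
    return p_x_given_z(x, rng.normal(size=k)).mean()

x = 1.0  # analytically x ~ N(1, 4 + SIGMA^2), so p(x=1) ≈ 0.199
small = np.array([mc_likelihood(x, 10) for _ in range(200)])
big = np.array([mc_likelihood(x, 10_000) for _ in range(200)])

# Only z near 0 give non-zero p(x|z), so few-sample estimates are wildly noisy.
print("k=10     estimates:", small.mean(), "+/-", small.std())
print("k=10000  estimates:", big.mean(), "+/-", big.std())
```

Because only a narrow band of z values contributes, the k=10 estimator's spread is an order of magnitude larger than the k=10000 one, which is exactly the "very expensive or very inaccurate" trade-off on the slide.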

19. Variational Autoencoders (VAEs): The Encoder
• During training, another network (the encoder, with parameters φ) can learn a distribution q_φ(z|x) of good z for a given x
• q_φ(z|x) should be much narrower than p(z)
• Then a single sample z ~ q_φ(z|x) is good enough

20. Variational Autoencoders (VAEs): The Encoder
• Can we still easily sample a new x?
• Need to make sure q_φ(z|x) stays close to p(z): regularize with the KL-divergence KL(q_φ(z|x) ‖ p(z))
• The negative loss can be shown to be a lower bound for the log-likelihood, and equal to it if q_φ(z|x) matches the true posterior
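The bound mentioned on this slide is, in standard notation, the evidence lower bound (ELBO); a reconstruction of the dropped formula:

```latex
\log p_\theta(x) \;\ge\;
\underbrace{\mathbb{E}_{z \sim q_\phi(z \mid x)}\!\big[\log p_\theta(x \mid z)\big]}_{\text{reconstruction term}}
\;-\;
\underbrace{D_{\mathrm{KL}}\!\big(q_\phi(z \mid x)\,\big\|\,p(z)\big)}_{\text{regularizer}}
```

Maximizing the right-hand side is the VAE training objective; the inequality becomes an equality exactly when q_φ(z|x) equals the true posterior p_θ(z|x).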

21. Reparameterization Trick
• Sampling z ~ q_φ(z|x) directly gives no gradient with respect to the encoder parameters φ
• Example when q_φ(z|x) = N(μ_φ(x), σ_φ(x)²): write z = μ_φ(x) + σ_φ(x) · ε, where ε ~ N(0, I)
• The sample ε does not depend on the parameters, so backprop can flow through μ_φ and σ_φ
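A minimal numeric sketch of the trick (the μ and σ values below are hypothetical encoder outputs for one input, not a trained network):

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical encoder outputs mu_phi(x), sigma_phi(x) for a single input x:
mu, sigma = 0.5, 2.0

eps = rng.normal(size=100_000)   # ε ~ N(0, 1): sampling that does NOT touch the parameters
z = mu + sigma * eps             # z ~ N(mu, sigma^2), differentiable in mu and sigma

# dz/dmu = 1 and dz/dsigma = eps, so gradients can reach the encoder parameters,
# which is impossible if z is drawn from N(mu, sigma^2) directly.
print("sample mean:", z.mean(), " sample std:", z.std())
```

The printed statistics confirm z is distributed as N(μ, σ²) even though all randomness lives in the parameter-free ε.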

22. Feature Space of Autoencoders vs. VAEs
[Figure: latent space of a plain autoencoder vs. that of a VAE]
SIGGRAPH Asia Course CreativeAI: Deep Learning for Graphics

23. Generating Data
• Sample z ~ p(z), then sample x from the generator distribution p_θ(x|z)
[Figure: generated samples on MNIST and Frey Faces]
Image Credit: Auto-Encoding Variational Bayes, Kingma and Welling

24. VAE on MNIST
https://www.siarez.com/projects/variational-autoencoder

25. Variational Autoencoder: geometry.cs.ucl.ac.uk/creativeai

26. Generative Adversarial Networks
• Player 1, the generator: scores if the discriminator can't distinguish its output from real images from the dataset
• Player 2, the discriminator: classifies inputs as real or fake, and scores if it can distinguish generated output from real images


28. Why Adversarial?
• If the discriminator directly approximated the data density p(x): the x at the maximum of p(x) would have the lowest generator loss
• The optimal generator would then collapse to a single mode at that maximum, with small variance
Image Credit: How (not) to Train your Generative Model: Scheduled Sampling, Likelihood, Adversary?, Ferenc Huszár

29. Why Adversarial?
• For GANs, the discriminator instead approximates the ratio p(x) / (p(x) + q(x)), which depends on the current generator distribution q
Image Credit: How (not) to Train your Generative Model: Scheduled Sampling, Likelihood, Adversary?, Ferenc Huszár

30. Why Adversarial?
• VAEs: maximize the likelihood of the data samples under the generated distribution
• GANs: the adversarial game approximately maximizes the likelihood of generator samples under the data distribution
Image Credit: How (not) to Train your Generative Model: Scheduled Sampling, Likelihood, Adversary?, Ferenc Huszár


32. GAN Objective
• The discriminator D(x) outputs the probability that x is not fake
• Both players use the fake/real binary cross-entropy (BCE) classification loss
• Discriminator objective: maximize E_x[log D(x)] + E_z[log(1 − D(G(z)))]
• Generator objective: minimize E_z[log(1 − D(G(z)))]
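Combining the two objectives above gives the usual minimax formulation; a reconstruction of the formula the slide dropped:

```latex
\min_{G}\,\max_{D}\;
\mathbb{E}_{x \sim p_{\text{data}}}\big[\log D(x)\big]
+ \mathbb{E}_{z \sim p(z)}\big[\log\big(1 - D(G(z))\big)\big]
```

The discriminator's inner maximization is a standard binary classifier; the generator's outer minimization pushes generated samples toward regions the discriminator calls real.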

  33. Non-saturating Heuristic Generator loss is negative binary cross-entropy: poor convergence Negative BCE 33 Image Credit: NIPS 2016 Tutorial: Generative Adversarial Networks, Ian Goodfellow

34. Non-saturating Heuristic
• The generator loss log(1 − D(G(z))) is a negative binary cross-entropy; it saturates when the discriminator confidently rejects fakes: poor convergence
• Flip the target class instead of flipping the sign: the generator maximizes log D(G(z)): good convergence, behaves like BCE
Image Credit: NIPS 2016 Tutorial: Generative Adversarial Networks, Ian Goodfellow
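The difference is visible directly in the gradient magnitudes. The sketch below (illustrative numbers, not from the slides) evaluates both generator losses where it matters, i.e. early in training when D(G(z)) is close to 0:

```python
import numpy as np

# D(G(z)) early in training: the discriminator confidently labels samples as fake.
d = np.array([0.001, 0.01, 0.1])

# Saturating loss  L = log(1 - d):  |dL/dd| = 1 / (1 - d)   -> stays near 1 (weak signal)
grad_saturating = 1.0 / (1.0 - d)
# Non-saturating   L = -log(d):     |dL/dd| = 1 / d         -> huge when d is small
grad_nonsat = 1.0 / d

print("saturating gradients:    ", grad_saturating)
print("non-saturating gradients:", grad_nonsat)
```

Flipping the target class keeps the loss shape of BCE but moves the strong-gradient region to exactly where the generator starts out, which is the heuristic's point.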

35. GAN Training
• Discriminator training step: update the discriminator with its loss on real images from the dataset and on generated samples
• Generator training step: update the generator with its loss, backpropagated through the discriminator's output on generated samples
• Interleave the two updates in each training step

36. DCGAN
• First paper to successfully use CNNs with GANs
• Due to using components that were novel at the time, like batch normalization, ReLUs, etc.
Image Credit: Unsupervised Representation Learning with Deep Convolutional Generative Adversarial Networks, Radford et al.

37. Generative Adversarial Network: geometry.cs.ucl.ac.uk/creativeai

38. Conditional GANs (CGANs)
• ≈ learn a mapping between images from example pairs
• Approximate sampling from a conditional distribution
Image Credit: Image-to-Image Translation with Conditional Adversarial Nets, Isola et al.
