TensorFlow Workshop 2018
Introduction to Deep Models
Part I: Classifiers and Generative Networks Nick Winovich
Department of Mathematics Purdue University
July 2018
Outline

1. Classifier Networks
   - Classifier Models
   - Softmax Function
   - MNIST Example

2. Generative Networks
   - Generative Models
   - Adversarial Training
   - Selected Examples
Classifier Models
Classifier models aim to assign labels/classes to input data; the inputs to classifier models typically correspond to structured data which is characterized by high-level properties or features.

The MNIST database of handwritten digits comprises 28x28 grayscale images of the digits 0-9, split into 60,000 training examples and 10,000 test examples:

http://yann.lecun.com/exdb/mnist/
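As a quick way to inspect the dataset, the images can be loaded through TensorFlow's bundled Keras datasets (a minimal sketch for exploration only; the workshop code itself reads *.tfrecords files instead):

```python
import tensorflow as tf

# Download (on first call) and load the MNIST images and labels;
# x_* are uint8 arrays of shape (num_examples, 28, 28), y_* hold digit labels 0-9
(x_train, y_train), (x_test, y_test) = tf.keras.datasets.mnist.load_data()

print(x_train.shape, y_train.shape)  # (60000, 28, 28) (60000,)
print(x_test.shape, y_test.shape)    # (10000, 28, 28) (10000,)
```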
Automating Feature Discovery
Defining features manually may be feasible for some simple cases, such as handwritten digits (e.g. symmetry, pixel intensity, etc.); for more complex problems, however, finding a sufficient set of features by hand is often impractical.

- When the input data is spatially organized, convolution layers can be used to identify local patterns and extract features (see the sketch below)
- Hidden layers can be used in series to form a hierarchy of features and to expand the receptive fields of convolutional layers
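A minimal sketch of this idea in TensorFlow 1.x, stacking two convolution layers so that the second layer sees a larger effective receptive field (the layer sizes here are illustrative assumptions, not taken from the workshop code):

```python
import tensorflow as tf

# Placeholder for a batch of 28x28 grayscale images (e.g. MNIST)
images = tf.placeholder(tf.float32, shape=[None, 28, 28, 1])

# First convolution layer: detects local, low-level patterns (edges, strokes)
conv1 = tf.layers.conv2d(images, filters=16, kernel_size=3,
                         padding='same', activation=tf.nn.relu)

# Second convolution layer: combines low-level patterns into higher-level
# features; stacking layers expands the receptive field (3x3 -> 5x5 here)
conv2 = tf.layers.conv2d(conv1, filters=32, kernel_size=3,
                         padding='same', activation=tf.nn.relu)
```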
The Softmax Function
The softmax function can be used to convert a collection of real values [z_1, ..., z_N] into a set of probability values [p_1, ..., p_N]. These probabilities sum to 1 by construction, and can be used to select a class using the one-hot encoding format. It is defined by:

\mathrm{softmax}(z_1, \ldots, z_N) = \left[ \frac{\exp(z_1)}{\sum_{n=1}^{N} \exp(z_n)}, \; \ldots, \; \frac{\exp(z_N)}{\sum_{n=1}^{N} \exp(z_n)} \right] = [p_1, \ldots, p_N]

The input values {z_n}, referred to as logits, are produced by the final network layer without using any activation. In general, the values {p_n} should not be computed directly, since efficient fused operations are available with better numerical stability.
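To illustrate the stability issue, a numerically stable softmax shifts the logits by their maximum before exponentiating (a standard trick, shown here for exposition only; in practice the fused TensorFlow ops discussed next should be preferred):

```python
import numpy as np

def softmax(z):
    """Numerically stable softmax: shifting by max(z) avoids overflow
    in exp() without changing the resulting probabilities."""
    shifted = z - np.max(z)
    exp_z = np.exp(shifted)
    return exp_z / np.sum(exp_z)

probs = softmax(np.array([2.0, 1.0, 0.1]))
print(probs, probs.sum())  # probabilities summing to 1.0
```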
Softmax Cross Entropy
Given a one-hot encoded output [y_1, ..., y_N] and a set of class probabilities [p_1, ..., p_N], we define the cross entropy by:

H(y, p) = -\sum_{n=1}^{N} y_n \cdot \log(p_n)

In particular, if the true class corresponds to k (so that y_k = 1 and y_n = 0 for all n ≠ k), the expression for the cross entropy reduces to:

H(y, p) = -\log(p_k)

By construction, this value is zero precisely when p_k = 1.

Note: In TensorFlow, use the original logits with the fused operation:

tf.nn.softmax_cross_entropy_with_logits_v2
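A minimal sketch of this fused operation in TensorFlow 1.x (the tensor names here are illustrative):

```python
import tensorflow as tf

labels = tf.placeholder(tf.float32, shape=[None, 10])  # one-hot targets
logits = tf.placeholder(tf.float32, shape=[None, 10])  # raw outputs (no activation)

# Fused softmax + cross entropy: the softmax is applied to the logits
# internally in a numerically stable way, so it must NOT be applied beforehand
cross_entropy = tf.nn.softmax_cross_entropy_with_logits_v2(labels=labels,
                                                           logits=logits)
loss = tf.reduce_mean(cross_entropy)
```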
Classifier Network: MNIST Example
A simple example of a classifier network trained on the MNIST database is provided in the TensorFlow_Examples repository:

https://github.com/nw2190/TensorFlow_Examples

The file Models/01_Classifier.py defines the classifier model and training procedure. It also uses the supplementary files:

- utils.py : writes the *.tfrecords files for training/validation
- flags.py : specifies model hyperparameters/training settings
- layers.py : defines custom network layers for the model
- misc.py : defines the parse function and early stopping hook
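For orientation, a minimal MNIST classifier along these lines might look as follows (a hedged sketch using standard TensorFlow 1.x layers, not the actual code from 01_Classifier.py; all layer sizes are illustrative):

```python
import tensorflow as tf

def classifier(images):
    """Small convolutional classifier: images of shape [batch, 28, 28, 1]
    are mapped to 10 unscaled logits (one per digit class)."""
    h = tf.layers.conv2d(images, 16, 3, padding='same', activation=tf.nn.relu)
    h = tf.layers.max_pooling2d(h, pool_size=2, strides=2)
    h = tf.layers.conv2d(h, 32, 3, padding='same', activation=tf.nn.relu)
    h = tf.layers.max_pooling2d(h, pool_size=2, strides=2)
    h = tf.layers.flatten(h)
    # Final layer has no activation; its outputs are the logits
    return tf.layers.dense(h, 10)

images = tf.placeholder(tf.float32, [None, 28, 28, 1])
labels = tf.placeholder(tf.float32, [None, 10])
logits = classifier(images)
loss = tf.reduce_mean(
    tf.nn.softmax_cross_entropy_with_logits_v2(labels=labels, logits=logits))
train_op = tf.train.AdamOptimizer(1e-3).minimize(loss)
```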
Generative Models
Generative models aim to produce “realistic” examples from a target dataset; more generally, these models aim to sample from the underlying distribution which characterizes a given dataset.
For example, we may consider a distribution representative of all the ways in which digits can be written by hand; a generative model can then be trained to generate images of these digits.
Example: Fully-Connected Generator

[Figure: fully-connected generator network mapping a random noise vector to a generated digit image]

Example: Convolutional Generator

[Figure: convolutional generator network mapping a random noise vector to a generated digit image]
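A minimal sketch of a convolutional generator in TensorFlow 1.x, following the structure suggested by the figures above (all layer sizes are illustrative assumptions):

```python
import tensorflow as tf

def generator(noise):
    """Maps a noise vector [batch, 64] to a 28x28 'digit' image by
    projecting to a small spatial grid and upsampling with
    transposed convolutions."""
    h = tf.layers.dense(noise, 7 * 7 * 32, activation=tf.nn.relu)
    h = tf.reshape(h, [-1, 7, 7, 32])
    # Each stride-2 transposed convolution doubles the spatial size
    h = tf.layers.conv2d_transpose(h, 16, 4, strides=2, padding='same',
                                   activation=tf.nn.relu)        # 14x14
    return tf.layers.conv2d_transpose(h, 1, 4, strides=2, padding='same',
                                      activation=tf.nn.sigmoid)  # 28x28

noise = tf.random_uniform([16, 64], minval=-1.0, maxval=1.0)
fake_images = generator(noise)
```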
Distinguishing “Real” from “Fake”
In order to actually train a generative model, however, we must have a way to accurately quantify how “real” the generated data is with respect to the target data distribution. For example, consider assigning a loss to the following “digit”:

[Figure: example of a generated “digit” image]

We might decide to manually assess predictions, having a group of individuals subjectively assign a loss; this is clearly not scalable.
Discriminators
A more practical approach is to instead define an additional network component, a discriminator, designed to handle the loss quantification [Goodfellow et al., 2014]. Discriminators are trained to distinguish between real and generated data. In particular, they are designed to take structured data as inputs and produce probability values p ∈ (0, 1) reflecting how confident the network is that the given data is real or fake. Discriminators are typically trained to assign values p ≈ 1 to data from the training set and values p ≈ 0 to generated data.

Goodfellow, I., Pouget-Abadie, J., Mirza, M., Xu, B., Warde-Farley, D., Ozair, S., Courville, A. and Bengio, Y., 2014. Generative adversarial nets. In Advances in Neural Information Processing Systems (pp. 2672-2680).
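A matching discriminator sketch in TensorFlow 1.x (again with illustrative layer sizes; the final sigmoid maps the raw score to a probability p ∈ (0, 1)):

```python
import tensorflow as tf

def discriminator(images):
    """Maps a batch of 28x28 images to probabilities in (0, 1) that
    each image is real rather than generated."""
    h = tf.layers.conv2d(images, 16, 4, strides=2, padding='same',
                         activation=tf.nn.leaky_relu)   # 14x14
    h = tf.layers.conv2d(h, 32, 4, strides=2, padding='same',
                         activation=tf.nn.leaky_relu)   # 7x7
    h = tf.layers.flatten(h)
    logits = tf.layers.dense(h, 1)        # unscaled "real vs. fake" score
    return tf.nn.sigmoid(logits), logits  # logits kept for stable losses
```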
Example: Fully-Connected Discriminator

[Figure: fully-connected discriminator network mapping an input image to a “real or fake” probability]

Example: Convolutional Discriminator

[Figure: convolutional discriminator network mapping an input image to a “real or fake” probability]
Adversarial Training
As the discriminator is learning to distinguish between real and generated data, the network’s generator component aims to deceive it by producing increasingly realistic data. To accomplish this, we define separate loss functions for the generator and discriminator components of the network:

- The generator loss L_G is defined so that the loss is minimized when the discriminator’s prediction D(ỹ) on generated data ỹ is equal to 1 (i.e. the discriminator believes the data is real).
- The discriminator loss L_D is defined so that the loss is minimized when the discriminator assigns the correct labels: i.e. D(ỹ) = 0 for generated data ỹ and D(y) = 1 for real data y.
Defining the Adversarial Loss Functions
Setting y ∼ Real and ỹ ∼ Generated, we define the losses by:

\mathcal{L}_G = -\log\left(D(\tilde{y})\right)

\mathcal{L}_D = -\log\left(D(y)\right) - \log\left(1 - D(\tilde{y})\right)
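In TensorFlow, these losses are commonly implemented with the fused sigmoid cross entropy op applied to the discriminator's raw logits (a sketch; D_logits_real and D_logits_fake are assumed to come from a discriminator like the one sketched above):

```python
import tensorflow as tf

def gan_losses(D_logits_real, D_logits_fake):
    """Standard GAN losses via the fused sigmoid cross entropy op,
    which is numerically stable when applied to raw logits."""
    # L_G = -log D(y~): generated data should be labeled 1 ("real")
    L_G = tf.reduce_mean(tf.nn.sigmoid_cross_entropy_with_logits(
        labels=tf.ones_like(D_logits_fake), logits=D_logits_fake))
    # L_D = -log D(y) - log(1 - D(y~)): real labeled 1, generated labeled 0
    L_D = tf.reduce_mean(tf.nn.sigmoid_cross_entropy_with_logits(
        labels=tf.ones_like(D_logits_real), logits=D_logits_real)) \
        + tf.reduce_mean(tf.nn.sigmoid_cross_entropy_with_logits(
            labels=tf.zeros_like(D_logits_fake), logits=D_logits_fake))
    return L_G, L_D
```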
Adversarial Training Algorithm

The generator and discriminator are trained in alternation: at each step, the discriminator is updated to better separate real from generated data, and the generator is updated to better fool the current discriminator.
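A minimal sketch of these alternating updates in TensorFlow 1.x, continuing from the loss sketch above (assumes L_G and L_D are defined as there, and that the generator/discriminator variables were created under correspondingly named variable scopes):

```python
import tensorflow as tf

# Separate optimizers, each updating only its own component's variables
G_vars = tf.get_collection(tf.GraphKeys.TRAINABLE_VARIABLES, scope='generator')
D_vars = tf.get_collection(tf.GraphKeys.TRAINABLE_VARIABLES, scope='discriminator')
G_step = tf.train.AdamOptimizer(2e-4).minimize(L_G, var_list=G_vars)
D_step = tf.train.AdamOptimizer(2e-4).minimize(L_D, var_list=D_vars)

with tf.Session() as sess:
    sess.run(tf.global_variables_initializer())
    for step in range(10000):
        sess.run(D_step)  # improve discriminator against current generator
        sess.run(G_step)  # improve generator against current discriminator
```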
Generative Adversarial Network: MNIST Example
A simple example of a generative network trained on the MNIST database is provided in the TensorFlow_Examples repository:

https://github.com/nw2190/TensorFlow_Examples

The file Models/02_GAN.py defines the generative model and training procedure. A wide range of additional generative models, with full TensorFlow implementations, are also available at:

https://github.com/hwalsuklee/tensorflow-generative-model-collections
Face Generation: Progressive Growing of GANs

Video (Progressive Growing of GANs - Face Generation):
https://www.youtube.com/watch?v=XOxxPcy5Gr4
Style Transfer: CycleGAN

Video (CycleGAN - Style Transfer for Video):
https://www.youtube.com/watch?v=9reHvktowLY