CSCE 496/896 Lecture 5: Autoencoders



SLIDE 1

CSCE 496/896 Lecture 5: Autoencoders Stephen Scott Introduction Basic Idea Stacked AE Transposed Convolutions Denoising AE Sparse AE Contractive AE Variational AE t-SNE GAN

CSCE 496/896 Lecture 5: Autoencoders

Stephen Scott

(Adapted from Eleanor Quint and Ian Goodfellow)

sscott@cse.unl.edu

SLIDE 2

Introduction

Autoencoding is training a network to replicate its input at its output

Applications:

- Pre-training with unlabeled data for semi-supervised learning
- Learning embeddings to support information retrieval
- Generating new instances similar to those in the training set
- Data compression

SLIDE 3

Outline

- Basic idea
- Stacking
- Types of autoencoders: denoising, sparse, contractive, variational
- Generative adversarial networks

SLIDE 4

Basic Idea

Sigmoid activation functions, 5000 training epochs, square loss, no regularization. What’s special about the hidden layer outputs?

SLIDE 5

Basic Idea

- An autoencoder is a network trained to learn the identity function: output = input
- The subnetwork called the encoder, f(·), maps the input to an embedded representation
- The subnetwork called the decoder, g(·), maps back to the input space
- Can be thought of as lossy compression of the input
- Needs to identify the important attributes of inputs to reproduce them faithfully

SLIDE 6

Basic Idea

General types of autoencoders, based on the size of the hidden layer:

Undercomplete autoencoders have a hidden layer smaller than the input layer

⇒ Dimension of the embedded space is lower than that of the input space
⇒ Cannot simply memorize training instances

Overcomplete autoencoders have much larger hidden layer sizes

⇒ Regularize to avoid overfitting, e.g., enforce a sparsity constraint

SLIDE 7

Basic Idea

Example: Principal Component Analysis

A 3-2-3 autoencoder with linear units and square loss performs principal component analysis: it finds the linear transformation of the data that maximizes variance
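This equivalence can be checked numerically. The following numpy sketch (illustrative, not from the lecture) computes, via SVD, the rank-2 PCA reconstruction that such a trained linear autoencoder converges to:

```python
import numpy as np

# Sketch: the reconstruction a trained 3-2-3 linear autoencoder converges to
# equals the rank-2 PCA reconstruction (best rank-2 approximation in square loss).
rng = np.random.default_rng(0)
X = rng.normal(size=(100, 3))          # 100 instances, 3 features
Xc = X - X.mean(axis=0)                # PCA assumes centered data

# Principal directions from SVD of the centered data
U, S, Vt = np.linalg.svd(Xc, full_matrices=False)
W_enc = Vt[:2].T                       # 3x2 "encoder" weights
codes = Xc @ W_enc                     # 2-D embedded representation
X_hat = codes @ W_enc.T                # 3-D reconstruction ("decoder")

# Reconstruction error equals the discarded variance (smallest singular value)
err = np.sum((Xc - X_hat) ** 2)
print(np.isclose(err, S[2] ** 2))      # True
```

Gradient descent on a linear 3-2-3 network can reach any basis of the same 2-D principal subspace; the SVD just picks one directly.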

SLIDE 8

Stacked Autoencoders

A stacked autoencoder has multiple hidden layers

Can share parameters to reduce their number by exploiting symmetry: W4 = W1⊤ and W3 = W2⊤

weights1 = tf.Variable(weights1_init, dtype=tf.float32, name="weights1")
weights2 = tf.Variable(weights2_init, dtype=tf.float32, name="weights2")
weights3 = tf.transpose(weights2, name="weights3")  # shared weights
weights4 = tf.transpose(weights1, name="weights4")  # shared weights

SLIDE 9

Stacked Autoencoders

Incremental Training

- Can simplify training by starting with a single hidden layer H1
- Then train a second AE to mimic the output of H1
- Insert this second AE into the first network
- Can build this by using H1’s output as the training set for Phase 2

SLIDE 10

Stacked Autoencoders

Incremental Training (Single TF Graph)

The previous approach requires multiple TensorFlow graphs. Can instead train both phases in a single graph: first the left side, then the right.

SLIDE 11

Stacked Autoencoders

Visualization

Input MNIST digit, network output, and weights (the features selected) for five nodes from H1:

SLIDE 12

Stacked Autoencoders

Semi-Supervised Learning

Can pre-train the network with unlabeled data ⇒ learn useful features, then train the “logic” of the dense layer with labeled data

SLIDE 13

Transfer Learning from Trained Classifier

Can also transfer from a classifier trained on a different task, e.g., transfer a GoogLeNet architecture to ultrasound classification. Often choose an existing model from a model zoo.

SLIDE 14

Transposed Convolutions

- What if some encoder layers are convolutional? How do we upsample back to the original resolution?
- Can use, e.g., linear interpolation, bilinear interpolation, etc.
- Or a transposed convolution, e.g., tf.layers.conv2d_transpose

SLIDE 15

Transposed Convolutions (2)

Consider this example convolution

SLIDE 16

Transposed Convolutions (3)

An alternative way of representing the kernel

SLIDE 17

Transposed Convolutions (4)

This representation works with matrix multiplication on flattened input:

SLIDE 18

Transposed Convolutions (5)

Transpose kernel, multiply by flat 2 × 2 to get flat 4 × 4
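The whole pipeline of slides 15 through 18 can be sketched in numpy (an illustrative reconstruction, with an assumed 3×3 kernel and 4×4 input rather than the deck's exact figures):

```python
import numpy as np

# Sketch: a 3x3 convolution on a 4x4 input (stride 1, no padding) written as a
# 4x16 matrix C acting on the flattened input; the transposed convolution is C.T.
k = np.arange(1, 10).reshape(3, 3)     # example 3x3 kernel
x = np.arange(16, dtype=float).reshape(4, 4)

# Build C: each row places the kernel at one of the 2x2 output positions
C = np.zeros((4, 16))
for r in range(2):
    for c in range(2):
        patch = np.zeros((4, 4))
        patch[r:r+3, c:c+3] = k
        C[2 * r + c] = patch.ravel()

y = (C @ x.ravel()).reshape(2, 2)      # ordinary convolution, 4x4 -> 2x2

# Direct sliding-window convolution agrees with the matrix form
y_direct = np.array([[np.sum(k * x[r:r+3, c:c+3]) for c in range(2)]
                     for r in range(2)])
print(np.allclose(y, y_direct))        # True

# Transposed convolution: C.T maps a flat 2x2 back up to a flat 4x4
up = (C.T @ y.ravel()).reshape(4, 4)
print(up.shape)                        # (4, 4)
```

In a network the entries of C.T are learned, not fixed; the point is only that the transpose reverses the shape change of the forward convolution.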

SLIDE 19

Denoising Autoencoders

Vincent et al. (2010)

Can train an autoencoder to learn to denoise its input by giving it a corrupted instance x̃ as input and targeting the uncorrupted instance x

Example noise models:

- Gaussian noise: x̃ = x + z, where z ∼ N(0, σ²I)
- Masking noise: zero out some fraction ν of the components of x
- Salt-and-pepper noise: choose some fraction ν of the components of x and set each to its min or max value (equally likely)
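A numpy sketch of the three noise models (illustrative; the variable names and the corruption fraction ν = 0.25 are assumptions, not from the slides):

```python
import numpy as np

# Sketch of the three corruption processes; x_tilde variants are fed to the
# denoising AE while the clean x remains the training target.
rng = np.random.default_rng(0)
x = rng.uniform(size=100)
nu, sigma = 0.25, 0.1                  # corruption fraction and noise scale
n_corrupt = int(nu * x.size)

# Gaussian noise: x_tilde = x + z, z ~ N(0, sigma^2 I)
x_gauss = x + rng.normal(0.0, sigma, size=x.shape)

# Masking noise: zero out a fraction nu of the components
idx = rng.choice(x.size, size=n_corrupt, replace=False)
x_mask = x.copy()
x_mask[idx] = 0.0

# Salt-and-pepper: set a fraction nu of components to min or max, equally likely
idx = rng.choice(x.size, size=n_corrupt, replace=False)
x_sp = x.copy()
x_sp[idx] = rng.choice([x.min(), x.max()], size=n_corrupt)

print((x_mask == 0.0).sum())           # number of masked components
```

Training pairs are then (x̃, x); the corruption is resampled every time an instance is presented.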

SLIDE 20

Denoising Autoencoders

SLIDE 21

Denoising Autoencoders

Example

SLIDE 22

Denoising Autoencoders

How does it work?

- Even though, e.g., MNIST data live in a 784-dimensional space, they lie on a low-dimensional manifold that captures their most important features
- The corruption process moves an instance x off of the manifold
- The encoder fθ and decoder gθ′ are trained to project x̃ back onto the manifold

SLIDE 23

Sparse Autoencoders

An overcomplete architecture

Regularize the outputs of the hidden layer to enforce sparsity:

J̃(x) = J(x, g(f(x))) + α Ω(h),

where J is the loss function, f is the encoder, g is the decoder, h = f(x), and Ω penalizes non-sparsity of h

E.g., can use Ω(h) = Σ_i |h_i| and ReLU activation to force many zero outputs in the hidden layer

Can also measure the average activation of h_i across a mini-batch and compare it to a user-specified target sparsity value p (e.g., 0.1) via square error or Kullback-Leibler divergence:

p log(p/q) + (1 − p) log((1 − p)/(1 − q)),

where q is the average activation of h_i over the mini-batch
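Both penalties can be sketched in a few lines of numpy (illustrative; the helper names l1_penalty and kl_sparsity are assumptions, not a standard API):

```python
import numpy as np

# Sketch: the two sparsity penalties from the slide, computed for a mini-batch
# of hidden activations H (rows = instances, columns = hidden units).
def l1_penalty(H):
    # Omega(h) = sum_i |h_i|, averaged over the mini-batch
    return np.abs(H).mean(axis=0).sum()

def kl_sparsity(H, p=0.1, eps=1e-10):
    # Compare target sparsity p to q, the average activation of each unit
    q = H.mean(axis=0)
    kl = p * np.log(p / (q + eps)) + (1 - p) * np.log((1 - p) / (1 - q + eps))
    return kl.sum()

H = np.array([[0.1, 0.0, 0.9],
              [0.1, 0.2, 0.7]])
print(kl_sparsity(H))   # grows as the column means drift from p = 0.1
```

Either penalty is added to the reconstruction loss with weight α, exactly as in J̃ above.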

SLIDE 24

Contractive Autoencoders

Similar to a sparse autoencoder, but use

Ω(h) = Σ_{j=1}^{m} Σ_{i=1}^{n} (∂h_i/∂x_j)²,

i.e., penalize large partial derivatives of the encoder outputs with respect to the input values

This contracts the output space by mapping input points in a neighborhood near x to a smaller output neighborhood near f(x)

⇒ Resists perturbations of the input x

If h has sigmoid activation, the encoding is nearly binary, and a CAE pushes embeddings to the corners of a hypercube
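For a one-layer sigmoid encoder the Jacobian has a closed form, ∂h_i/∂x_j = h_i(1 − h_i)W_ij, so the penalty is easy to sketch in numpy (illustrative code, checked here against a finite difference):

```python
import numpy as np

# Sketch: the contractive penalty for a one-layer sigmoid encoder h = s(W x + b),
# whose Jacobian is dh_i/dx_j = h_i (1 - h_i) W_ij.
def sigmoid(a):
    return 1.0 / (1.0 + np.exp(-a))

rng = np.random.default_rng(0)
W = rng.normal(size=(4, 3))            # 4 hidden units, 3 inputs
b = rng.normal(size=4)
x = rng.normal(size=3)

h = sigmoid(W @ x + b)
jac = (h * (1 - h))[:, None] * W       # 4x3 Jacobian of the encoder at x
omega = np.sum(jac ** 2)               # Omega(h): squared Frobenius norm

# Check one entry against a finite difference
eps = 1e-6
x2 = x.copy(); x2[0] += eps
fd = (sigmoid(W @ x2 + b)[0] - h[0]) / eps
print(np.isclose(jac[0, 0], fd, atol=1e-4))   # True
```

In a framework with autodiff the Jacobian would be obtained automatically; the closed form just makes the penalty cheap for this architecture.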

SLIDE 25

Variational Autoencoders

A VAE is an autoencoder that is also a generative model

⇒ Can generate new instances according to a probability distribution

- E.g., hidden Markov models, Bayesian networks
- Contrast with discriminative models, which predict classifications

The encoder f outputs [µ, σ]⊤:

- The pair (µ_i, σ_i) parameterizes a Gaussian distribution for dimension i = 1, . . . , n
- Draw z_i ∼ N(µ_i, σ_i)
- Decode this latent variable z to get g(z)

SLIDE 26

Variational Autoencoders

Latent Variables

- Independence of the z dimensions makes it easy to generate instances wrt complex distributions via the decoder g
- Latent variables can be thought of as values of attributes describing the inputs

E.g., for MNIST, latent variables might represent “thickness”, “slant”, “loop closure”

SLIDE 27

Variational Autoencoders

Architecture

SLIDE 28

Variational Autoencoders

Optimization

Maximum likelihood (ML) approach for training generative models: find a model (θ) with maximum probability of generating the training set X

Achieve this by minimizing the sum of:

- The end-to-end AE loss (e.g., square, cross-entropy)
- A regularizer measuring the distance (K-L divergence) between the latent distribution q(z | x) and N(0, I) (the standard multivariate Gaussian)

N(0, I) is also considered the prior distribution over z (the distribution when no x is known)

eps = 1e-10
latent_loss = 0.5 * tf.reduce_sum(
    tf.square(hidden3_sigma) + tf.square(hidden3_mean)
    - 1 - tf.log(eps + tf.square(hidden3_sigma)))
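The TensorFlow snippet above implements the closed-form divergence KL(N(µ, σ²) ‖ N(0, I)) = ½ Σ_i (σ_i² + µ_i² − 1 − log σ_i²). A numpy sketch of the same quantity (illustrative values):

```python
import numpy as np

# Numpy version of the latent loss: the closed-form KL divergence
# KL( N(mu, sigma^2) || N(0, I) ) = 0.5 * sum(sigma^2 + mu^2 - 1 - log sigma^2)
def latent_loss(mu, sigma, eps=1e-10):
    return 0.5 * np.sum(sigma**2 + mu**2 - 1.0 - np.log(eps + sigma**2))

# When the encoder output matches the prior exactly, the penalty vanishes
print(np.isclose(latent_loss(np.zeros(3), np.ones(3)), 0.0, atol=1e-6))  # True

# Any deviation from the prior is penalized
print(latent_loss(np.array([1.0, 0.0]), np.array([1.0, 2.0])) > 0)       # True
```

The eps term only guards the log against σ = 0; it plays no statistical role.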

SLIDE 29

Variational Autoencoders

Reparameterization Trick

Cannot backpropagate an error signal through random samples. The reparameterization trick emulates z ∼ N(µ, σ) with ε ∼ N(0, 1), z = εσ + µ
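A numpy sketch of the trick (illustrative values for µ and σ):

```python
import numpy as np

# Sketch of the reparameterization trick: instead of sampling z ~ N(mu, sigma)
# directly, sample eps ~ N(0, 1) and set z = eps * sigma + mu. The sample is a
# deterministic, differentiable function of mu and sigma; gradients can flow
# through them, while the randomness lives only in eps.
rng = np.random.default_rng(0)
mu = np.array([0.5, -1.0])
sigma = np.array([1.0, 2.0])

eps = rng.normal(size=(10000, 2))      # noise is independent of the parameters
z = eps * sigma + mu                   # reparameterized samples

print(np.allclose(z.mean(axis=0), mu, atol=0.1))      # empirical mean ~ mu
print(np.allclose(z.std(axis=0), sigma, atol=0.1))    # empirical std ~ sigma
```

The distribution of z is exactly N(µ, σ), but the sampling node now sits outside the path from the loss back to the encoder.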

SLIDE 30

Variational Autoencoders

Example Generated Images: Random

Draw z ∼ N(0, I) and display g(z)

SLIDE 31

Variational Autoencoders

Example Generated Images: Manifold

Uniformly sample points in (2-dimensional) z space and decode

SLIDE 32

Variational Autoencoders

2D Cluster Analysis

Cluster analysis by digit (2D latent space)

SLIDE 33

Aside: Visualizing with t-SNE

van der Maaten and Hinton (2008)

- Visualize high-dimensional data, e.g., embedded representations
- Want the low-dimensional representation to have neighborhoods similar to those of the high-dimensional one
- Map each high-dimensional x_1, . . . , x_N to low-dimensional y_1, . . . , y_N by matching pairwise distributions based on distance

⇒ Probability p_ij that pair (x_i, x_j) is chosen as similar should match probability q_ij that pair (y_i, y_j) is chosen

Set p_ij = (p_{j|i} + p_{i|j})/(2N), where

p_{j|i} = exp(−‖x_i − x_j‖²/(2σ_i²)) / Σ_{k≠i} exp(−‖x_i − x_k‖²/(2σ_i²))

and σ_i is chosen to control the density of the distribution

I.e., p_{j|i} is the probability of x_i choosing x_j as its neighbor, if neighbors are chosen in proportion to a Gaussian density centered at x_i
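A numpy sketch of these conditional probabilities (illustrative; it uses a single fixed σ rather than per-point σ_i values tuned to a target perplexity):

```python
import numpy as np

# Sketch: the conditional neighbor probabilities p_{j|i} and the symmetrized
# p_ij from the slide, for a fixed bandwidth sigma.
def conditional_p(X, sigma=1.0):
    sq = np.sum((X[:, None, :] - X[None, :, :]) ** 2, axis=-1)  # ||xi - xj||^2
    P = np.exp(-sq / (2 * sigma**2))
    np.fill_diagonal(P, 0.0)           # a point is never its own neighbor
    return P / P.sum(axis=1, keepdims=True)

rng = np.random.default_rng(0)
X = rng.normal(size=(5, 3))
P_cond = conditional_p(X)
p = (P_cond + P_cond.T) / (2 * len(X))  # symmetrized p_ij

print(np.allclose(P_cond.sum(axis=1), 1.0))  # each row is a distribution
print(np.isclose(p.sum(), 1.0))              # p_ij sums to 1 over all pairs
```

In the real algorithm each σ_i is found by binary search so that row i has a user-chosen perplexity.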

SLIDE 34

Aside: Visualizing with t-SNE (2)

van der Maaten and Hinton (2008)

Also define q via Student's t distribution:

q_ij = (1 + ‖y_i − y_j‖²)⁻¹ / Σ_{k≠ℓ} (1 + ‖y_k − y_ℓ‖²)⁻¹

Using Student's t instead of a Gaussian helps address the crowding problem, where distant clusters in x space squeeze together in y space

Now choose y values to match distributions p and q via the Kullback-Leibler divergence:

Σ_{i≠j} p_ij log(p_ij / q_ij)
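A numpy sketch of q and the KL objective (illustrative helper names; the gradient-based optimization of the y values is omitted):

```python
import numpy as np

# Sketch: the Student-t similarities q_ij and the KL objective from the slide,
# for a given symmetric p matrix over the high-dimensional points.
def q_matrix(Y):
    sq = np.sum((Y[:, None, :] - Y[None, :, :]) ** 2, axis=-1)
    inv = 1.0 / (1.0 + sq)             # (1 + ||yi - yj||^2)^-1
    np.fill_diagonal(inv, 0.0)
    return inv / inv.sum()             # normalize over all pairs k != l

def kl_objective(P, Q, eps=1e-12):
    mask = ~np.eye(len(P), dtype=bool)  # sum over i != j
    return np.sum(P[mask] * np.log((P[mask] + eps) / (Q[mask] + eps)))

rng = np.random.default_rng(0)
P = rng.uniform(size=(5, 5)); np.fill_diagonal(P, 0.0)
P = (P + P.T); P /= P.sum()            # any symmetric p_ij summing to 1
Y = rng.normal(size=(5, 2))            # candidate low-dimensional embedding

Q = q_matrix(Y)
print(np.isclose(Q.sum(), 1.0))        # True
print(kl_objective(P, Q) >= 0)         # True: KL divergence is nonnegative
```

t-SNE descends the gradient of this objective with respect to Y, typically with momentum.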

SLIDE 35

Generative Adversarial Network

- GANs are also generative models, like VAEs
- A GAN models a game between two players:
  - The generator creates samples intended to appear to come from the training distribution
  - The discriminator attempts to discern the “real” (original training) samples from the “fake” (generated) ones
- The discriminator trains as a binary classifier; the generator trains to fool the discriminator

SLIDE 36

Generative Adversarial Network

How the Game Works

Let D(x) be the discriminator, parameterized by θ(D)

- Goal: find θ(D) minimizing J(D)(θ(D), θ(G))

Let G(z) be the generator, parameterized by θ(G)

- Goal: find θ(G) minimizing J(G)(θ(D), θ(G))

A Nash equilibrium of this game is a pair (θ(D), θ(G)) such that each θ(i), i ∈ {D, G}, yields a local minimum of its corresponding J

SLIDE 37

Generative Adversarial Network

Training

Each training step:

- Draw a minibatch of x values from the dataset
- Draw a minibatch of z values from the prior (e.g., N(0, I))
- Simultaneously update θ(G) to reduce J(G) and θ(D) to reduce J(D), via, e.g., Adam

For J(D), it is common to use cross-entropy, where the label is 1 for real and 0 for fake

Since the generator wants to trick the discriminator, can use J(G) = −J(D)

Other generator losses exist that are generally better in practice, e.g., based on ML
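A numpy sketch of these two losses for a toy batch of discriminator outputs (illustrative; note that only the fake-sample term of J(D) depends on θ(G), so J(G) = −J(D) reduces to the negation of that term):

```python
import numpy as np

# Sketch: the cross-entropy discriminator loss J(D) from the slide and the
# minimax generator loss, for a toy batch of discriminator outputs
# (probabilities that each sample is real).
def j_discriminator(d_real, d_fake, eps=1e-12):
    # Binary cross-entropy: label 1 for real samples, 0 for generated ones
    return -np.mean(np.log(d_real + eps)) - np.mean(np.log(1.0 - d_fake + eps))

def j_generator(d_fake, eps=1e-12):
    # Negation of the discriminator's fake-sample term; the generator lowers
    # this loss by driving d_fake toward 1 (fooling the discriminator)
    return np.mean(np.log(1.0 - d_fake + eps))

d_real = np.array([0.9, 0.8, 0.95])    # D is confident the real batch is real
d_fake = np.array([0.1, 0.2, 0.05])    # ...and the generated batch is fake
print(j_discriminator(d_real, d_fake) < j_discriminator(d_fake, d_real))  # True
```

The "generally better in practice" alternatives mentioned above replace j_generator with, e.g., −mean(log d_fake), which gives stronger gradients early in training.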

SLIDE 38

Generative Adversarial Network

DCGAN: Radford et al. (2015)

“Deep convolutional GAN.” The generator uses transposed convolutions (e.g., tf.layers.conv2d_transpose) without pooling to upsample images for input to the discriminator.

SLIDE 39

Generative Adversarial Network

DCGAN Generated Images: Bedrooms

Trained on the LSUN dataset; sampled from z space

SLIDE 40

Generative Adversarial Network

DCGAN Generated Images: Adele Facial Expressions

Trained on frame grabs of an interview; sampled from z space

SLIDE 41

Generative Adversarial Network

DCGAN Generated Images: Latent Space Arithmetic

Performed semantic arithmetic in z space! (Non-center images have noise added in z space; center is noise-free)
