SLIDE 1
  • 8. Other Deep Architectures

CS 519 Deep Learning, Winter 2018 Fuxin Li

With materials from Zsolt Kira and Ian Goodfellow

SLIDE 2

A brief overview of other architectures

  • Unsupervised Architectures
  • Deep Belief Networks
  • Autoencoders
  • GANs
  • Temporal Architectures
  • Recurrent Neural Networks (RNN)
  • LSTM
  • We will cover these topics carefully later
  • Right now this is just a brief overview, in case you are tempted to use them in your project

SLIDE 3

Unsupervised Deep Learning

  • CNNs are most successful when there are a lot of training examples
  • What can we do if we do not have any training examples?
  • Or have very few of them?
SLIDE 4

Remember PCA: Characteristics and Limitations

  • Easy: can be computed via eigendecomposition
  • Select the first K components based on how much variance is captured
  • Bases are orthogonal
  • Optimal under some assumptions (Gaussian)
  • Those assumptions are almost never true in real data
SLIDE 5

PCA as a β€œneural network”

[Figure: input vector → code → reconstructed vector]

  • PCA goal:
  • Minimize reconstruction error

$$\min_{\mathbf{W}} \sum_{j=1}^{n} \big\| \mathbf{y}_j - \mathbf{W}\mathbf{W}^\top \mathbf{y}_j \big\|^2$$

where $\mathbf{W}^\top \mathbf{y}_j$ is the code for input $\mathbf{y}_j$ and $\mathbf{W}\mathbf{W}^\top \mathbf{y}_j$ is its reconstruction.

SLIDE 6

Generalize PCA to a multi-layer nonlinear network

[Figure: input vector → many encoding layers → code → many decoding layers → output vector]

  • Deep Autoencoder (see the sketch below)
  • Same as other NNs (linear transform + nonlinearity + linear transform, etc.)
  • Only difference is that after decoding, it strives to reconstruct the original input
  • Can have convolutional/fully-connected/sparse versions
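A minimal sketch of a fully-connected deep autoencoder, assuming PyTorch (layer sizes here are illustrative, not the slides'):

```python
import torch
import torch.nn as nn

# Encoder compresses the input down to a small code; the decoder mirrors it.
encoder = nn.Sequential(
    nn.Linear(3072, 1024), nn.ReLU(),
    nn.Linear(1024, 256),                 # the "code"
)
decoder = nn.Sequential(
    nn.Linear(256, 1024), nn.ReLU(),
    nn.Linear(1024, 3072),                # back to input size
)
autoencoder = nn.Sequential(encoder, decoder)

opt = torch.optim.Adam(autoencoder.parameters(), lr=1e-3)
x = torch.rand(64, 3072)                  # a batch of flattened 32x32x3 images
for _ in range(100):
    opt.zero_grad()
    loss = nn.functional.mse_loss(autoencoder(x), x)   # reconstruction error
    loss.backward()
    opt.step()
```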

SLIDE 7

Krizhevsky’s deep autoencoder

[Figure: encoder layer sizes 8192 → 4096 → 2048 → 1024 → 512, down to a 256-bit binary code]

The encoder has about 67,000,000 parameters. It takes a few days on a GTX 285 GPU to train on two million images (the Tiny Images dataset).

SLIDE 8

Reconstructions of 32x32 color images from 256-bit codes

SLIDE 9

[Figure: images retrieved using 256-bit codes vs. images retrieved using Euclidean distance in pixel intensity space]

SLIDE 10

[Figure: another query; images retrieved using 256-bit codes vs. images retrieved using Euclidean distance in pixel intensity space]

SLIDE 11

Generative Adversarial Networks

SLIDE 12

Generative Adversarial Networks

  • Cost for the discriminator:
  • Standard cross-entropy loss, with everything from the data distribution $p_{\text{data}}$ labeled 1, and everything from the generator $G$ labeled 0
  • Cost for the generator:
  • Try to generate examples that "fool" the discriminator (both costs are written out below)
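In the standard notation of Goodfellow et al. (2014), the two costs can be written as:

```latex
% Discriminator: cross-entropy, real data labeled 1, generated data labeled 0
J^{(D)} = -\,\mathbb{E}_{x \sim p_{\text{data}}}\big[\log D(x)\big]
          \;-\; \mathbb{E}_{z \sim p_z}\big[\log\big(1 - D(G(z))\big)\big]

% Generator (non-saturating variant): make the discriminator call fakes real
J^{(G)} = -\,\mathbb{E}_{z \sim p_z}\big[\log D(G(z))\big]
```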
SLIDE 13

DCGAN

SLIDE 14

Samples of DCGAN-generated images

SLIDE 15

DCGAN representations

SLIDE 16

Text-to-Image with GANs

SLIDE 17

Text-to-Image with GANs

SLIDE 18

Problems

SLIDE 19

Problems

SLIDE 20

iGAN

https://www.youtube.com/watch?v=9c4z6YsBGQ0

SLIDE 21
Recurrent Neural Networks (RNNs)

  • Temporal, sequences
  • Tied weights across time steps (see the sketch below)
  • Some additional variants: Recursive Autoencoders, Long Short-Term Memory (LSTM)
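A minimal NumPy sketch of the tied-weight idea (sizes and names are illustrative): the same weight matrices are applied at every time step, so the network handles sequences of any length:

```python
import numpy as np

rng = np.random.default_rng(0)
W_h = rng.standard_normal((64, 64)) * 0.1   # hidden-to-hidden, shared across time
W_x = rng.standard_normal((64, 32)) * 0.1   # input-to-hidden, shared across time

def rnn_forward(xs):
    """xs: sequence of length-32 input vectors; returns the final hidden state."""
    h = np.zeros(64)
    for x in xs:                            # one step per sequence element,
        h = np.tanh(W_h @ h + W_x @ x)      # reusing the same (tied) weights
    return h

h_final = rnn_forward([rng.standard_normal(32) for _ in range(10)])
```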

SLIDE 22

Machine Translation

  • Have to look at the entire sentence (or many sentences)
SLIDE 23

Image Captioning

SLIDE 24

Restricted Boltzmann Machines

  • Generative version of the encoder
  • Binary-valued hidden variables
  • Defines probabilities such as $P(h_j \mid \mathbf{y})$ and $P(y_i \mid \mathbf{h})$ (written out below)
  • You can generate samples of the observed variables from the hidden ones
  • Think of it as an extension of probabilistic PCA
  • Only if you are into generative models (PGM class)
  • Unsupervised pre-training method to train it (Hinton, Salakhutdinov 2006)
  • Convolutional and fully connected versions available
  • Doesn't perform very well.
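In standard RBM notation (binary units, weights $W$, hidden biases $b_j$, visible biases $c_i$; this notation is assumed, not taken from the slide), the conditionals are:

```latex
P(h_j = 1 \mid \mathbf{y}) = \sigma\!\Big(b_j + \sum_i W_{ij}\, y_i\Big),
\qquad
P(y_i = 1 \mid \mathbf{h}) = \sigma\!\Big(c_i + \sum_j W_{ij}\, h_j\Big)
```

where $\sigma$ is the logistic sigmoid; alternating the two conditionals (Gibbs sampling) is how samples of the observed variables are generated from the hidden ones.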
SLIDE 25

Fooling a deep network (Szegedy et al. 2013)

  • Optimize a perturbation $\Delta x$ of the image $x$ to maximize the network's prediction $f_c$ for a class $c$:

$$\max_{\Delta x} \; f_c(x + \Delta x) \;-\; \mu\, \|\Delta x\|^2$$

(Szegedy et al. 2013, Goodfellow et al. 2014, Nguyen et al. 2015)

[Figure: image + 0.03·Δx = adversarial image; a Giant Panda (99.32% confidence) becomes a Goldfish (95.15% confidence) or a Shark (93.89% confidence)]
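One of the cited follow-ups, Goodfellow et al. (2014), crafts such a perturbation in a single gradient step (the fast gradient sign method). A minimal PyTorch sketch, assuming a trained classifier `model` (all names here are illustrative):

```python
import torch
import torch.nn.functional as F

def fgsm_perturb(model, x, target_class, eps=0.03):
    """One-step targeted perturbation in the style of Goodfellow et al. (2014).

    x: a 1 x C x H x W image tensor; target_class: the class to push toward.
    """
    x = x.clone().requires_grad_(True)
    loss = F.cross_entropy(model(x), torch.tensor([target_class]))
    loss.backward()
    # Stepping against the loss gradient (sign only, budget eps per pixel)
    # pushes the prediction toward target_class while keeping ||Δx||_∞ ≤ eps.
    return (x - eps * x.grad.sign()).detach()
```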