Autoencoders
- Prof. Leal-Taixé and Prof. Niessner

Machine learning: supervised vs. unsupervised learning
Supervised learning:
– Labels or target classes are given.
– Goal: learn a mapping from input to label.
– Examples: classification, regression.
Supervised learning example: images annotated with labels (DOG, CAT).
Unsupervised learning:
– No labels available.
– Goal: learn the underlying structure of the data (e.g., clustering, PCA).
Autoencoders
– Goal: learn a lower-dimensional feature representation from unlabeled training data.
– The encoder maps the input image to a compressed feature representation (the bottleneck layer).
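As a rough sketch of the idea (NumPy only; the data, dimensions, and learning rate are illustrative, not from the lecture), a linear autoencoder can be trained with plain gradient descent on the mean-squared reconstruction loss:

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy data: 200 samples in 4-D that actually live near a 2-D subspace.
basis = rng.normal(size=(2, 4))
x = rng.normal(size=(200, 2)) @ basis + 0.01 * rng.normal(size=(200, 4))

# Linear autoencoder: encoder w_enc (4 -> 2), decoder w_dec (2 -> 4).
w_enc = rng.normal(scale=0.1, size=(4, 2))
w_dec = rng.normal(scale=0.1, size=(2, 4))

lr = 0.01
losses = []
for _ in range(500):
    z = x @ w_enc            # latent code, dim(z) < dim(x)
    x_rec = z @ w_dec        # reconstruction x'
    err = x_rec - x
    losses.append((err ** 2).mean())
    # Gradients of the (scaled) mean-squared reconstruction loss.
    grad_dec = z.T @ err / len(x)
    grad_enc = x.T @ (err @ w_dec.T) / len(x)
    w_dec -= lr * grad_dec
    w_enc -= lr * grad_enc

print(f"loss: {losses[0]:.4f} -> {losses[-1]:.4f}")
```

A real autoencoder replaces the two linear maps with deep (convolutional) encoder and decoder networks, but the training signal is the same: reconstruct the input through a bottleneck.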
Architecture:
– Encoder: convolutions map the input image down to the bottleneck.
– Decoder: transpose convolutions map the bottleneck back to an output image.
– Trained with a reconstruction loss between input and output (e.g., L1, L2).
Latent space z with dim(z) < dim(x): the input x is compressed into z and decoded into the reconstruction x'.
No labels are required: we can train on unlabeled data to first capture its structure.
Example: embedding of MNIST digits in the latent space.
Typical semi-supervised setting:
– Large set of unlabeled data.
– Small set of labeled data.
Idea: use the autoencoder to "learn" the type of features present, e.g., in CT images.
Pretraining and transfer:
– Train the autoencoder (input -> reconstruction), then throw away the decoder.
– Fine-tune the encoder with ground truth labels for supervised learning: compute the loss and backprop as always.
Use cases:
– Image -> same image reconstructed: use the encoder as "feature extractor".
– Image -> semantic segmentation.
– Low-resolution image -> high-resolution image.
– Image -> depth map.
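The pretrain-then-fine-tune workflow can be sketched structurally as follows (NumPy only; the dimensions are hypothetical, and the "pretrained" weights are random stand-ins for illustration):

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical dimensions: 784-D input (e.g., MNIST), 32-D bottleneck, 10 classes.
IN, LATENT, CLASSES = 784, 32, 10

# Stage 1: weights of a (pre)trained autoencoder (random here, for illustration).
w_enc = rng.normal(scale=0.01, size=(IN, LATENT))
w_dec = rng.normal(scale=0.01, size=(LATENT, IN))

# Stage 2: throw away the decoder, keep the encoder as feature extractor,
# and attach a small classification head trained on the labeled subset.
del w_dec
w_head = rng.normal(scale=0.01, size=(LATENT, CLASSES))

x = rng.normal(size=(5, IN))     # a mini-batch of 5 flattened images
features = x @ w_enc             # encoder output (bottleneck features)
logits = features @ w_head       # supervised head

print(features.shape, logits.shape)
```

Only the head (and optionally the encoder) is then trained on the small labeled set with the usual supervised loss and backprop.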
Semantic segmentation with encoder-decoder networks
[Long et al. 2015] Fully Convolutional Networks for Semantic Segmentation (FCN).
Can we do better?
Badrinarayanan et al. "SegNet: A Deep Convolutional Encoder-Decoder Architecture for Image Segmentation". TPAMI 2016.
Encoder: normal convolutional filters + pooling.
Decoder: upsampling + convolutional filters.
The decoder filters are also learned using backprop; their goal is to refine the upsampling.
The learned upsampling operation is a transposed convolution (never call it "deconvolution").
Example: a 3x3 input is mapped to a 5x5 output.
Soft-max layer: the output of the soft-max classifier is a K-channel image of probabilities, where K is the number of classes.
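A minimal single-channel transposed convolution can be written directly (NumPy only; a naive reference implementation, not how frameworks compute it). Each input value "paints" a scaled copy of the kernel into the output, which is how a 3x3 input grows to a 5x5 output with a 3x3 kernel and stride 1:

```python
import numpy as np

def conv_transpose2d(x, k, stride=1):
    """Naive 2-D transposed convolution (single channel, no padding)."""
    h, w = x.shape
    kh, kw = k.shape
    out = np.zeros((stride * (h - 1) + kh, stride * (w - 1) + kw))
    # Each input value adds a scaled copy of the kernel to the output.
    for i in range(h):
        for j in range(w):
            out[i*stride:i*stride+kh, j*stride:j*stride+kw] += x[i, j] * k
    return out

x = np.arange(9.0).reshape(3, 3)   # 3x3 input
k = np.ones((3, 3))                # 3x3 kernel
y = conv_transpose2d(x, k)
print(y.shape)  # (5, 5): stride*(3-1) + 3 = 5 per side
```

With stride 2 the same function upsamples more aggressively, which is why transposed convolutions are the standard learnable upsampling layer in decoders.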
Fixed upsampling by interpolation (figure: original image vs. nearest neighbor, bilinear, and bicubic interpolation; image: Michael Guerzhoy). Higher-order interpolation leaves fewer artifacts.
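The simplest of these fixed schemes, nearest-neighbor upsampling, just repeats each pixel; a one-line NumPy sketch (factor and image are illustrative):

```python
import numpy as np

def upsample_nearest(img, factor):
    """Nearest-neighbor upsampling: repeat each pixel factor x factor times."""
    return np.repeat(np.repeat(img, factor, axis=0), factor, axis=1)

img = np.array([[1, 2],
                [3, 4]])
up = upsample_nearest(img, 2)
print(up)
# [[1 1 2 2]
#  [1 1 2 2]
#  [3 3 4 4]
#  [3 3 4 4]]
```

Bilinear and bicubic interpolation replace the pixel repetition with weighted averages over 2x2 and 4x4 neighborhoods, respectively.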
Max unpooling + convolutions: an efficient alternative to fixed interpolation.
During max pooling in the encoder, keep the locations where the max came from; unpooling in the decoder places each value back at its remembered location.
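A sketch of this pooling-with-indices / unpooling pair (NumPy only; 2x2 windows, single channel, names illustrative):

```python
import numpy as np

def max_pool_with_indices(x):
    """2x2 max pooling that also records where each max came from."""
    h, w = x.shape
    pooled = np.zeros((h // 2, w // 2))
    indices = np.zeros((h // 2, w // 2), dtype=int)  # flat index into x
    for i in range(h // 2):
        for j in range(w // 2):
            patch = x[2*i:2*i+2, 2*j:2*j+2]
            r, c = np.unravel_index(patch.argmax(), (2, 2))
            pooled[i, j] = patch[r, c]
            indices[i, j] = (2*i + r) * w + (2*j + c)
    return pooled, indices

def max_unpool(pooled, indices, shape):
    """Place each pooled value back at its original max location; rest stays 0."""
    out = np.zeros(shape)
    out.flat[indices.ravel()] = pooled.ravel()
    return out

x = np.array([[1., 2., 5., 6.],
              [3., 4., 7., 8.],
              [9., 1., 2., 3.],
              [1., 2., 4., 1.]])
pooled, idx = max_pool_with_indices(x)
restored = max_unpool(pooled, idx, x.shape)
```

The sparse unpooled map is then densified by the decoder's convolutions; only the indices need to be stored, which makes this memory-efficient.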
Now the convolutional filters are LEARNED. In DeConvNet, we convolve with the transpose of the learned filter, which keeps the details of the structures.
Skip connections: pass the low-level information from the encoder directly to the decoder, alongside the high-level information (recall ResNet).
Alternatively, the skipped feature maps can be appended (concatenated) to the decoder features instead of added.
Which variant works best depends on your problem.
Qualitative SegNet results (figure: input, ground truth, SegNet prediction).
Super-resolution:
– The content of the image needs to pass through the network (skip connections [2] or other strategies [1]).
[1] C. Dong et al. "Image Super-Resolution Using Deep Convolutional Networks". TPAMI 2015.
[2] X.J. Mao et al. "Image Restoration Using Very Deep Convolutional Encoder-Decoder Networks with Symmetric Skip Connections". NIPS 2016.
A plain per-pixel loss is often not enough to achieve high quality results: averaging over plausible outputs blurs the results (black car vs. white car).
This motivates perceptual losses (instead of the pixel-wise reconstruction loss).
Gatys et al. "A neural algorithm of artistic style". arXiv:1508.06576, 2015.
Perceptual loss: pass both images through a pretrained network and compare their feature maps.
Content loss at layer j: compare the feature maps of the generated image and of the ground truth image at layer j, normalized by the feature map size (channels x height x width). We want "similar" features triggered for the generated image; low-level details (e.g., exact color) do not have to match.
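This content loss is a short computation once the feature maps are extracted (NumPy only; the toy feature maps stand in for activations of a pretrained network):

```python
import numpy as np

def content_loss(feat_gen, feat_gt):
    """Squared L2 distance between feature maps, normalized by their size.

    feat_gen, feat_gt: arrays of shape (C, H, W) taken from the same layer.
    """
    c, h, w = feat_gen.shape
    return ((feat_gen - feat_gt) ** 2).sum() / (c * h * w)

feat_gt = np.zeros((2, 4, 4))
feat_gen = np.ones((2, 4, 4))
print(content_loss(feat_gen, feat_gt))  # 32 / 32 = 1.0
```

Because the comparison happens in feature space rather than pixel space, two images can have low content loss while differing in color or texture.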
The same idea extends to style transfer [1].
[1] Gatys et al. "A neural algorithm of artistic style". arXiv:1508.06576, 2015. (Image: J. Johnson)
Style loss: pass the style image through the network and compute the Gram matrix of the features at layer j. The Gram matrix captures which channels tend to activate together: reshape the feature maps to C x (HW) and multiply by the transpose. Since all spatial positions are pooled together, the style loss is independent of the arrangement of the content.
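The Gram matrix computation itself is two lines (NumPy only; the normalization by feature map size follows one common convention and is a choice, not the only one):

```python
import numpy as np

def gram_matrix(features):
    """Gram matrix of a (C, H, W) feature map: reshape to C x (HW), then F @ F.T."""
    c, h, w = features.shape
    f = features.reshape(c, h * w)
    return f @ f.T / (c * h * w)   # normalize by feature map size

feats = np.ones((3, 2, 2))
g = gram_matrix(feats)
print(g.shape)  # (3, 3): one correlation value per pair of channels
```

The style loss then compares the Gram matrices of the generated and the style image, e.g., with a squared Frobenius norm.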
Optimization-based transfer: start with a white noise image and optimize it via forward/backward passes through VGG.
Weighting: more weight to the content loss preserves the content; more weight to the style loss emphasizes the style. (Image: Johnson/Fei-Fei/Yeung)
Fast style transfer: train a feed-forward network to do the transfer (one network per style).
The pretrained loss network stays fixed: the generated image is passed through it to compute the content loss and the style loss.
Other applications:
– Unsupervised anomaly segmentation: "Autoencoding Models for Unsupervised Anomaly Segmentation in Brain MR Images". MICCAI 2018.
– Learning a joint representation of several sources (audio and video).