Deep Learning in Computer Vision
Deep Learning in Action, 24 June ’15
Computer Vision Group: 6 postdocs, 16 PhD students
Research in Computer Vision
Deep Learning in Computer Vision Caner Hazırbaş | vision.in.tum.de
[Figure: image → edges (or any other hand-crafted features) → classifier → “Person”]
Learning good features automatically from raw data
Google’s cat detection neural network
[Figure: artificial neuron with weights w1, w2, w3 and a bias b (on a constant input 1), followed by a “non-linear” transformation]
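The neuron in the figure (weights w1, w2, w3, a bias b, then a non-linear transformation) can be sketched in a few lines. This is a minimal illustration, not the presenter's code: the sigmoid is only one possible choice of non-linearity, and the input and weight values are hypothetical.

```python
import math

def neuron(inputs, weights, bias):
    """Weighted sum of the inputs plus a bias, passed through a sigmoid.

    The sigmoid is an assumed non-linearity; the slide only says "non-linear".
    """
    z = sum(w * x for w, x in zip(weights, inputs)) + bias
    return 1.0 / (1.0 + math.exp(-z))

# Hypothetical values for three inputs, weights w1..w3 and bias b.
output = neuron([1.0, 0.5, -1.0], [0.2, -0.4, 0.1], 0.3)
print(output)
```

Stacking many such units in layers, each feeding the next, gives the kind of network the following slides describe.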
Input: ‘Pixels’ → 1st and 2nd layers: ‘Edges’ → 3rd layer: ‘Object Parts’ → 4th layer: ‘Objects’
[Figure: third-layer features for faces, cars, airplanes, motorbikes]
Unsupervised Methods
[Figure: encode → decode]
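The encode and decode steps resemble an autoencoder, although the slide text does not name the method, so treat that reading as an assumption. The sketch below trains a tiny linear encoder/decoder pair to reconstruct 2-D points from a 1-D code without any labels. All names, data and hyper-parameters are hypothetical.

```python
def encode(x, enc):
    """Compress a 2-D input into a single number (the code)."""
    return sum(e * xi for e, xi in zip(enc, x))

def decode(code, dec):
    """Expand the code back into a 2-D reconstruction."""
    return [d * code for d in dec]

def reconstruction_loss(data, enc, dec):
    """Sum of squared differences between inputs and reconstructions."""
    total = 0.0
    for x in data:
        x_hat = decode(encode(x, enc), dec)
        total += sum((a - b) ** 2 for a, b in zip(x_hat, x))
    return total

def train(data, enc, dec, lr=0.01, epochs=200):
    """Per-sample gradient descent on the squared reconstruction error."""
    enc, dec = list(enc), list(dec)
    for _ in range(epochs):
        for x in data:
            code = encode(x, enc)
            err = [d * code - xi for d, xi in zip(dec, x)]
            # Gradient of the error with respect to the code (uses old dec).
            grad_code = sum(e * d for e, d in zip(err, dec))
            dec = [d - lr * 2 * e * code for d, e in zip(dec, err)]
            enc = [w - lr * 2 * grad_code * xi for w, xi in zip(enc, x)]
    return enc, dec

# Unlabeled toy data: every point lies on the line y = 2x.
data = [[1.0, 2.0], [2.0, 4.0], [-1.0, -2.0], [0.5, 1.0]]
enc0, dec0 = [0.1, 0.1], [0.1, 0.1]
initial_loss = reconstruction_loss(data, enc0, dec0)
enc, dec = train(data, enc0, dec0)
final_loss = reconstruction_loss(data, enc, dec)
print(initial_loss, final_loss)
```

Because the data is one-dimensional in disguise, a single code value is enough to reconstruct it; the training signal is the input itself, which is what makes the method unsupervised.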
Supervised Methods
[Figure: Vision (Deep CNN) → Language (Generating RNN): “A group of people shopping at an outdoor market. There are many vegetables at the fruit stand.”]
Stochastic Gradient Descent — supervised learning
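As a rough illustration of stochastic gradient descent in a supervised setting, the sketch below fits a line to labeled points by updating the parameters after every single example. The model, data, learning rate and function name are assumptions chosen for brevity; a real network would have many more parameters but follow the same loop.

```python
import random

def sgd_linear_fit(xs, ys, lr=0.05, epochs=500, seed=0):
    """Fit y = w * x + b with per-example gradient steps on squared error."""
    rng = random.Random(seed)
    w, b = 0.0, 0.0
    indices = list(range(len(xs)))
    for _ in range(epochs):
        rng.shuffle(indices)          # visit the examples in random order
        for i in indices:
            error = (w * xs[i] + b) - ys[i]
            # Gradient of 0.5 * error**2 with respect to w and b.
            w -= lr * error * xs[i]
            b -= lr * error
    return w, b

# Labeled toy data generated from y = 2x + 1.
xs = [0.0, 1.0, 2.0, 3.0]
ys = [2.0 * x + 1.0 for x in xs]
w, b = sgd_linear_fit(xs, ys)
print(w, b)
```

The "stochastic" part is that each step uses one randomly chosen example rather than the gradient over the whole training set.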
(e.g. 2D images, 3D video/volumetric images)
and pooling layers
previous layer
[Figure: convolutional network with layers conv1, pool1, conv, pool2]
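The two building blocks named in the figure, convolution and pooling, can be sketched without any library. This is a minimal illustration under stated assumptions: a single channel, "valid" borders, stride 1 for the convolution, and non-overlapping 2 x 2 max pooling. The image and kernel are hypothetical. As in most deep learning code, the "convolution" here does not flip the kernel.

```python
def conv2d_valid(image, kernel):
    """Slide the kernel over the image and sum the element-wise products."""
    kh, kw = len(kernel), len(kernel[0])
    out_h = len(image) - kh + 1
    out_w = len(image[0]) - kw + 1
    return [[sum(image[i + a][j + b] * kernel[a][b]
                 for a in range(kh) for b in range(kw))
             for j in range(out_w)]
            for i in range(out_h)]

def max_pool(feature_map, size=2):
    """Keep the maximum of each non-overlapping size x size block."""
    out_h = len(feature_map) // size
    out_w = len(feature_map[0]) // size
    return [[max(feature_map[i * size + a][j * size + b]
                 for a in range(size) for b in range(size))
             for j in range(out_w)]
            for i in range(out_h)]

# A 5 x 5 image with a vertical edge between columns 1 and 2.
image = [[0, 0, 1, 1, 1] for _ in range(5)]
# A kernel that responds to dark-to-bright transitions from left to right.
kernel = [[-1, 1],
          [-1, 1]]

feature_map = conv2d_valid(image, kernel)   # 4 x 4, responds only at the edge
pooled = max_pool(feature_map)              # 2 x 2 summary of the response
print(feature_map)
print(pooled)
```

The same small kernel is reused at every position, and pooling keeps the strongest response in each neighbourhood, which shrinks the feature map while tolerating small shifts of the edge.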
Convolutional networks take advantage of the properties of natural signals:
Computer Vision tasks
scale problems, e.g. classification
data
tuning hyper-parameters)
In collaboration with the University of Freiburg (lmb.informatik.uni-freiburg.de)
[Figure: FlowNetSimple and FlowNetCorr architectures (conv1 through conv6, followed by refinement and prediction; FlowNetCorr additionally contains the corr and conv_redir layers)]
[Figure: detail of the FlowNetCorr corr and conv_redir layers]
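The corr layer compares features from two images at different relative displacements. The sketch below is a simplified stand-in, not the FlowNet implementation: it assumes single-pixel feature vectors (no patches), stride 1, and zero correlation outside the image. The function name and toy data are hypothetical. The 441 in the figure would be consistent with a 21 x 21 grid of displacements, though the slide text does not spell this out.

```python
def correlation(f1, f2, max_disp):
    """Dot-product correlation of per-pixel feature vectors.

    f1, f2: feature maps as nested lists indexed [row][col][channel].
    Returns out[row][col], a list of (2 * max_disp + 1) ** 2 values,
    one per displacement (dy, dx), ordered row by row.
    """
    h, w = len(f1), len(f1[0])
    offsets = [(dy, dx)
               for dy in range(-max_disp, max_disp + 1)
               for dx in range(-max_disp, max_disp + 1)]
    out = []
    for y in range(h):
        row = []
        for x in range(w):
            values = []
            for dy, dx in offsets:
                y2, x2 = y + dy, x + dx
                if 0 <= y2 < h and 0 <= x2 < w:
                    values.append(sum(a * b for a, b in
                                      zip(f1[y][x], f2[y2][x2])))
                else:
                    values.append(0.0)  # displaced position is off the map
            row.append(values)
        out.append(row)
    return out

# Toy 1 x 3 maps with one channel: the second is the first shifted right by 1.
f1 = [[[0.0], [1.0], [0.0]]]
f2 = [[[0.0], [0.0], [1.0]]]
corr = correlation(f1, f2, max_disp=1)
# Index of the strongest response at the middle pixel of the first map.
best = max(range(9), key=lambda k: corr[0][1][k])
print(best)
```

With a maximum displacement of 1 there are 9 output values per pixel, and the strongest one at the middle pixel sits at the entry for displacement (0, +1), matching the one-pixel shift to the right.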
[Figure: Vision (Deep CNN) → Language (Generating RNN): “A group of people shopping at an outdoor market. There are many vegetables at the fruit stand.”]
Example generated captions:
A woman is throwing a frisbee in a park.
A little girl sitting on a bed with a teddy bear.
A group of people sitting on a boat in the water.
A giraffe standing in a forest with trees in the background.
A dog is standing on a hardwood floor.
A stop sign is on a road with a mountain in the background.
End of Presentation
Caner Hazırbaş | hazirbas@cs.tum.edu
Quoc V. Le, Rajat Monga, Matthieu Devin, Kai Chen, Greg S. Corrado, Jeff Dean, Andrew Y. Ng ICML’12
Hierarchical Representations
Honglak Lee, Roger Grosse, Rajesh Ranganath, Andrew Y. Ng ICML’09
Alex Krizhevsky, Ilya Sutskever, Geoffrey E. Hinton NIPS’12
Philipp Fischer, Alexey Dosovitskiy, Eddy Ilg, Philip Häusser, Caner Hazırbaş, Vladimir Golkov, Patrick van der Smagt, Daniel Cremers, Thomas Brox
selection-technology/deep-learning-image-classification/
descent/