convolutional neural networks for computer vision

Convolutional Neural Networks for Computer Vision, by Caner Hazırbaş



  1. Convolutional Neural Networks for Computer Vision. Caner Hazırbaş, Centrum für Informations- und Sprachverarbeitung, 24 November 2015

  2. Computer Vision Group: 5 postdocs, 24 PhD students. Caner Hazırbaş | vision.in.tum.de | Convolutional Neural Networks for Computer Vision

  3. Research in Computer Vision

  4. Convolutional Neural Networks for Computer Vision

  5. What is deep learning? A representation learning method:
     • Learning good features automatically from raw data
     • Learning representations of data with multiple levels of abstraction
     • Example: Google's cat detection neural network

  6. Going deeper in the network: Input ('Pixels'), 1st and 2nd layers ('Edges'), 3rd layer ('Object Parts'), 4th layer ('Objects'). [Figure: learned features for faces, cars, airplanes and motorbikes.]

  7. Deep Learning Methods: Unsupervised Methods
     • Restricted Boltzmann Machines
     • Deep Belief Networks
     • Autoencoders: unsupervised feature extraction/learning (encode, then decode)
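The encode/decode idea behind an autoencoder can be sketched in a few lines. This is a minimal illustration, not the method from the talk: it assumes a linear autoencoder with a single hidden unit, a hypothetical toy dataset lying on a line, and hand-picked values for the learning rate, step count and initial weights.

```python
def train_autoencoder(data, lr=0.05, steps=300):
    """Linear autoencoder 2 -> 1 -> 2 trained by full-batch gradient descent.

    All hyperparameters and initial weights are illustrative assumptions.
    """
    w = [0.5, 0.5]   # encoder weights: code h = w . x
    v = [0.5, 0.5]   # decoder weights: reconstruction = v * h
    n = len(data)
    losses = []
    for _ in range(steps):
        gw, gv, loss = [0.0, 0.0], [0.0, 0.0], 0.0
        for x in data:
            h = w[0] * x[0] + w[1] * x[1]               # encode
            err = [v[k] * h - x[k] for k in range(2)]   # decode and compare
            loss += err[0] ** 2 + err[1] ** 2
            back = err[0] * v[0] + err[1] * v[1]        # error propagated to the code
            for k in range(2):
                gv[k] += 2 * err[k] * h
                gw[k] += 2 * back * x[k]
        losses.append(loss / n)                         # mean reconstruction error
        w = [w[k] - lr * gw[k] / n for k in range(2)]
        v = [v[k] - lr * gv[k] / n for k in range(2)]
    return w, v, losses

# Hypothetical data: 2D points on the line x2 = 2 * x1, so one code unit suffices.
data = [(t, 2.0 * t) for t in (-1.0, -0.5, 0.5, 1.0)]
w, v, losses = train_autoencoder(data)
```

No labels are used: the input itself is the training target, which is what makes the feature learning unsupervised.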

  8. Deep Learning Methods: Supervised Methods
     • Deep Neural Networks
     • Recurrent Neural Networks
     • Convolutional Neural Networks
     [Figure: vision (deep CNN) combined with language (generating RNN). Example captions: "A group of people shopping at an outdoor market." "There are many vegetables at the fruit stand."]

  9. How to train a deep network? Stochastic Gradient Descent (supervised learning):
     • show the input vectors of a few examples
     • compute the outputs and the errors
     • compute the average gradient
     • update the weights accordingly
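The four steps above can be sketched as a minibatch SGD loop. This is a minimal sketch on a hypothetical one-dimensional linear model with squared error. The model, dataset, learning rate, batch size and step count are all assumptions chosen for illustration, not values from the talk.

```python
import random

def sgd_linear(data, lr=0.1, batch_size=4, steps=500, seed=0):
    """Fit y ~ w * x + b by minibatch SGD on the squared error."""
    rng = random.Random(seed)
    w, b = 0.0, 0.0
    for _ in range(steps):
        batch = rng.sample(data, batch_size)   # show a few examples
        grad_w = grad_b = 0.0
        for x, y in batch:
            err = (w * x + b) - y              # compute the output and the error
            grad_w += 2 * err * x
            grad_b += 2 * err
        grad_w /= batch_size                   # compute the average gradient
        grad_b /= batch_size
        w -= lr * grad_w                       # update the weights accordingly
        b -= lr * grad_b
    return w, b

# Hypothetical noise-free data generated from y = 3x + 1.
data = [(k / 10.0, 3.0 * (k / 10.0) + 1.0) for k in range(-10, 11)]
w, b = sgd_linear(data)
```

In a deep network the gradient would come from backpropagation rather than a closed-form expression, but the loop structure stays the same.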

  10. How to train a deep network? Alternatives:
     • AdaGrad, AdaDelta, NAG (Nesterov's Accelerated Gradient)…
     • Adam (now in Caffe: http://caffe.berkeleyvision.org/tutorial/solver.html)
     Adam is a gradient-based optimization method (like SGD). It includes an "adaptive moment estimation" ($m_t$, $v_t$) and can be regarded as a generalization of AdaGrad. The update formulas are:
     $$(m_t)_i = \beta_1 (m_{t-1})_i + (1 - \beta_1)\,(\nabla L(W_t))_i, \qquad (v_t)_i = \beta_2 (v_{t-1})_i + (1 - \beta_2)\,(\nabla L(W_t))_i^2$$
     $$(W_{t+1})_i = (W_t)_i - \alpha\, \frac{\sqrt{1 - (\beta_2)^t}}{1 - (\beta_1)^t}\, \frac{(m_t)_i}{\sqrt{(v_t)_i} + \varepsilon}$$
     D. Kingma, J. Ba. Adam: A Method for Stochastic Optimization. International Conference on Learning Representations, 2015.
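The update formulas above translate almost directly into code. The sketch below is a plain-Python illustration, not Caffe's implementation. The default values for `alpha`, `beta1`, `beta2` and `eps`, and the toy quadratic loss, are assumptions that do not appear on the slide.

```python
import math

def adam_step(w, grad, m, v, t, alpha=0.001, beta1=0.9, beta2=0.999, eps=1e-8):
    """One Adam update on lists of floats; t is the 1-based step count."""
    # Bias-correction factor sqrt(1 - beta2^t) / (1 - beta1^t) from the update rule.
    correction = math.sqrt(1.0 - beta2 ** t) / (1.0 - beta1 ** t)
    new_w, new_m, new_v = [], [], []
    for w_i, g_i, m_i, v_i in zip(w, grad, m, v):
        m_i = beta1 * m_i + (1.0 - beta1) * g_i          # first moment estimate
        v_i = beta2 * v_i + (1.0 - beta2) * g_i * g_i    # second moment estimate
        new_w.append(w_i - alpha * correction * m_i / (math.sqrt(v_i) + eps))
        new_m.append(m_i)
        new_v.append(v_i)
    return new_w, new_m, new_v

# Toy problem (assumed for illustration): minimise L(w) = sum(w_i^2), gradient 2 * w_i.
w = [1.0, -2.0]
m = [0.0, 0.0]
v = [0.0, 0.0]
for t in range(1, 2001):
    grad = [2.0 * w_i for w_i in w]
    w, m, v = adam_step(w, grad, m, v, t, alpha=0.01)
```

With zero-initialised moments, the first step moves each weight by roughly `alpha` regardless of the gradient's magnitude, which is one way to see the per-coordinate normalisation at work.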

  11. Convolutional Neural Networks
     • CNNs are designed to process data in the form of multiple arrays (e.g. 2D images, 3D video/volumetric images)
     • A typical architecture is composed of a series of stages: convolutional layers and pooling layers
     • Each unit is connected to local patches in the feature maps of the previous layer
     [Figure: example network of convolution and pooling stages.]
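One convolution stage followed by one pooling stage can be sketched as follows. This is a minimal single-channel illustration in plain Python. The image, the edge kernel and the pooling size are hypothetical, and real networks would add many channels, biases and nonlinearities.

```python
def conv2d_valid(image, kernel):
    """'Valid' 2D cross-correlation, the operation CNN layers usually call convolution."""
    kh, kw = len(kernel), len(kernel[0])
    out_h = len(image) - kh + 1
    out_w = len(image[0]) - kw + 1
    # Each output unit only sees a local kh x kw patch of the input.
    return [[sum(image[y + i][x + j] * kernel[i][j]
                 for i in range(kh) for j in range(kw))
             for x in range(out_w)]
            for y in range(out_h)]

def max_pool2d(fmap, size=2):
    """Non-overlapping max pooling; rows/columns that do not fill a window are dropped."""
    out_h, out_w = len(fmap) // size, len(fmap[0]) // size
    return [[max(fmap[y * size + i][x * size + j]
                 for i in range(size) for j in range(size))
             for x in range(out_w)]
            for y in range(out_h)]

# Hypothetical 5 x 5 image with a vertical edge between columns 1 and 2.
image = [[0, 0, 1, 1, 1] for _ in range(5)]
edge_kernel = [[-1, 1]]                        # responds to a left-to-right increase
features = conv2d_valid(image, edge_kernel)    # 5 x 4 feature map
pooled = max_pool2d(features, size=2)          # 2 x 2 map after pooling
```

The feature map is non-zero only where the edge sits, and pooling keeps that response while halving the resolution.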

  12. Key Idea behind Convolutional Networks. Convolutional networks take advantage of the properties of natural signals:
     • local connections
     • shared weights
     • pooling
     • the use of many layers
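A quick parameter count shows what local connections and shared weights buy. The sizes below (a 32 x 32 RGB input, 16 feature maps, 5 x 5 kernels) are hypothetical numbers chosen for the arithmetic and do not come from the slides.

```python
def fully_connected_params(in_h, in_w, in_c, out_units):
    """Every output unit has its own weight for every input value, plus a bias."""
    return in_h * in_w * in_c * out_units + out_units

def conv_params(kernel_h, kernel_w, in_c, out_c):
    """One small kernel per output map, reused at every position, plus a bias per map."""
    return kernel_h * kernel_w * in_c * out_c + out_c

# Hypothetical layer: 32 x 32 x 3 input mapped to 16 feature maps of size 32 x 32.
dense = fully_connected_params(32, 32, 3, 32 * 32 * 16)
conv = conv_params(5, 5, 3, 16)
print(dense, conv)  # 50348032 1216
```

Under these assumptions the convolutional layer needs about 1.2 thousand parameters where a fully connected layer producing the same number of outputs would need about 50 million.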

  13. FlowNet: Learning Optical Flow with Convolutional Networks. Philipp Fischer, Alexey Dosovitskiy, Eddy Ilg, Thomas Brox, Philip Häusser, Caner Hazırbaş, Vladimir Golkov, Daniel Cremers, Patrick van der Smagt

  14. Flying Chairs

  15. Flying Chairs [Figure: displacement histograms for Flying Chairs and Sintel. x-axis: displacement (px), 0 to 100; y-axis: number of pixels, shown on linear and log scales.]

  16. Data Augmentation [Figure: generated vs. augmented images.]
     • translation, rotation, scaling, additive Gaussian noise
     • changes in brightness, contrast, gamma and colour
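Some of the augmentations listed above can be sketched on a small grayscale image stored as nested lists with values in [0, 1]. This is an illustrative sketch only: the parameter ranges are assumptions, rotation, scaling and colour changes are left out for brevity, and all function names are hypothetical.

```python
import random

def clamp01(value):
    return min(1.0, max(0.0, value))

def translate(img, dx, dy):
    """Shift content right by dx and down by dy pixels, filling uncovered pixels with 0."""
    h, w = len(img), len(img[0])
    return [[img[y - dy][x - dx] if 0 <= y - dy < h and 0 <= x - dx < w else 0.0
             for x in range(w)]
            for y in range(h)]

def photometric(img, brightness=0.0, contrast=1.0, gamma=1.0):
    """Scale contrast around 0.5, add a brightness offset, clamp, then apply gamma."""
    return [[clamp01((p - 0.5) * contrast + 0.5 + brightness) ** gamma for p in row]
            for row in img]

def add_gaussian_noise(img, sigma, rng):
    return [[clamp01(p + rng.gauss(0.0, sigma)) for p in row] for row in img]

def augment(img, rng):
    """Apply one random combination; all ranges below are assumed, not from the talk."""
    out = translate(img, rng.randint(-2, 2), rng.randint(-2, 2))
    out = photometric(out,
                      brightness=rng.uniform(-0.2, 0.2),
                      contrast=rng.uniform(0.8, 1.2),
                      gamma=rng.uniform(0.7, 1.5))
    return add_gaussian_noise(out, sigma=0.04, rng=rng)

rng = random.Random(0)
image = [[(x + y) / 6.0 for x in range(4)] for y in range(4)]
augmented = augment(image, rng)
```

For optical flow training, a geometric change applied to the images would also need a matching change to the ground-truth flow field, which this sketch does not cover.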

  17. FlowNetSimple [Figure: FlowNetSimple architecture, with layers conv1, conv2, conv3, conv3_1, conv4, conv4_1, conv5, conv5_1 and conv6, followed by refinement and prediction.]

  18. FlowNetSimple - Flying Chairs

  19. FlowNetSimple - Sintel

  20. FlowNetCorr [Figure: FlowNetCorr architecture, with layers conv1, conv2 and conv3, a correlation layer (corr) alongside conv_redir, then conv3_1, conv4, conv4_1, conv5, conv5_1 and conv6, followed by refinement and prediction.]

  21. Correlation Layer [Figure: detail of the correlation layer, showing corr (441) and conv_redir (1 x 1).]
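A correlation layer compares features from two images across a range of displacements. The sketch below is a simplified plain-Python reading of that idea, not the FlowNet implementation. It assumes single-pixel patches, stride 1 and zero padding, and the function name, `max_disp` parameter and toy feature maps are hypothetical.

```python
def correlation(f1, f2, max_disp):
    """Correlate two feature maps indexed as [channel][y][x].

    Returns a dict mapping each displacement (dy, dx) to an h x w plane whose
    entry at (y, x) is the dot product of f1's feature vector at (y, x) with
    f2's feature vector at (y + dy, x + dx); out-of-range positions give 0.
    """
    channels, h, w = len(f1), len(f1[0]), len(f1[0][0])
    out = {}
    for dy in range(-max_disp, max_disp + 1):
        for dx in range(-max_disp, max_disp + 1):
            plane = [[0.0] * w for _ in range(h)]
            for y in range(h):
                for x in range(w):
                    y2, x2 = y + dy, x + dx
                    if 0 <= y2 < h and 0 <= x2 < w:
                        plane[y][x] = sum(f1[c][y][x] * f2[c][y2][x2]
                                          for c in range(channels))
            out[(dy, dx)] = plane
    return out

# Toy single-channel maps: the active pixel moves one step to the right.
f1 = [[[0, 0, 0], [0, 1, 0], [0, 0, 0]]]
f2 = [[[0, 0, 0], [0, 0, 1], [0, 0, 0]]]
corr = correlation(f1, f2, max_disp=1)   # 3 x 3 = 9 displacement channels
```

Here only the plane for displacement (0, 1) responds at the centre pixel, which is the motion between the two maps. A 21 x 21 grid of displacements would give 441 output channels, which matches the number on the slide, though reading it that way is an assumption here.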

  22. FlowNetCorr - Flying Chairs

  23. Simple vs. Corr - Flying Chairs [Figure panels: FlowNetS, FlowNetCorr.]

  24. FlowNetCorr - Sintel

  25. Simple vs. Corr - Sintel [Figure panels: FlowNetS, FlowNetCorr.]

  26. FlowNetSimple + Variational Smoothing

  27. FlowNet: Learning Optical Flow with Convolutional Networks

  28. References
     • Quoc V. Le, Rajat Monga, Matthieu Devin, Kai Chen, Greg S. Corrado, Jeff Dean, Andrew Y. Ng. Building High-level Features Using Large Scale Unsupervised Learning. ICML'12
     • Honglak Lee, Roger Grosse, Rajesh Ranganath, Andrew Y. Ng. Convolutional Deep Belief Networks for Scalable Unsupervised Learning of Hierarchical Representations. ICML'09
     • Alex Krizhevsky, Ilya Sutskever, Geoffrey E. Hinton. ImageNet Classification with Deep Convolutional Neural Networks. NIPS'12
     • Y. LeCun, L. Bottou, Y. Bengio, P. Haffner. Gradient-based learning applied to document recognition. Proceedings of the IEEE, 1998
     • Philipp Fischer, Alexey Dosovitskiy, Eddy Ilg, Philip Häusser, Caner Hazırbaş, Vladimir Golkov, Patrick van der Smagt, Daniel Cremers, Thomas Brox. FlowNet: Learning Optical Flow with Convolutional Networks
