Deep Learning in Computer Vision



  1. Deep Learning in Computer Vision
 Caner Hazırbaş
 Deep Learning in Action, 24 June ’15

  2. Computer Vision Group: 6 postdocs, 16 PhD students
 Caner Hazırbaş | vision.in.tum.de

  3. Research in Computer Vision
 • Robot Vision
 • Shape Analysis
 • Image-based 3D Reconstruction
 • Image Segmentation
 • RGB-D Vision
 • Visual SLAM
 • Optical Flow
 • Convex Relaxation Methods

  4. Deep Learning in Computer Vision

  5. How to teach a machine? Compute edges (or any other hand-crafted features) and pass them to a classifier, which outputs “Person”.

  6. How to teach a machine? Edges (or any other hand-crafted features) passed to a classifier to predict “Person”: not a good representation.

  7. What is deep learning? A representation learning method:
 • Learning good features automatically from raw data
 • Learning representations of data with multiple levels of abstraction
 [Figure: Google’s cat detection neural network]

  8. Construction of higher levels of abstraction
 [Figure: a unit with weights w1, w2, w3, a bias b (on a constant input 1) and a “non-linear” transformation]
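
 The unit on this slide can be sketched in a few lines of Python. This is only an illustration: the slide says "non-linear" without naming a function, so the logistic sigmoid used here is an assumption, and the numeric values for w1, w2, w3 and b are hypothetical.

```python
import math

def unit_output(inputs, weights, bias):
    """Weighted sum of the inputs plus a bias, passed through a non-linearity."""
    pre_activation = sum(w * x for w, x in zip(weights, inputs)) + bias
    # Logistic sigmoid, chosen only for illustration (assumption, not from the slide).
    return 1.0 / (1.0 + math.exp(-pre_activation))

# Hypothetical values for w1, w2, w3 and b.
weights = [0.5, -1.0, 2.0]
bias = 0.5

# 0.5*1.0 - 1.0*2.0 + 2.0*0.5 + 0.5 = 0.0, and sigmoid(0.0) = 0.5
activation = unit_output([1.0, 2.0, 0.5], weights, bias)
print(activation)
```

 Stacking many such units in layers, each fed by the outputs of the layer before, is what builds the higher levels of abstraction the slide title refers to.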

  9. Going deeper in the network: Input (‘Pixels’), 1st and 2nd Layers (‘Edges’), 3rd Layer (‘Object Parts’), 4th Layer (‘Objects’)
 [Figure: learned features for faces, cars, airplanes and motorbikes]

  10. Deep Learning Methods: Unsupervised Methods
 • Restricted Boltzmann Machines
 • Deep Belief Networks
 • Auto-encoders: unsupervised feature extraction/learning (encode, then decode)
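
 To make the encode/decode idea concrete, here is a minimal sketch of a linear auto-encoder that compresses 2-D points to a single code value and reconstructs them. Everything in it is an assumption made for illustration (the toy data, the 2-1-2 layout, the learning rate, plain full-batch gradient descent); it is not the example shown on the slide. Note that no labels are used: the input itself is the training target.

```python
# Toy data: 2-D points that all lie on one line, so one code value suffices.
data = [(t, 2.0 * t) for t in (-1.0, -0.5, 0.5, 1.0)]

def reconstruct(x, enc, dec):
    code = enc[0] * x[0] + enc[1] * x[1]          # encode: 2 values -> 1
    return code, [dec[0] * code, dec[1] * code]   # decode: 1 value -> 2

def mean_squared_error(enc, dec):
    total = 0.0
    for x in data:
        _, recon = reconstruct(x, enc, dec)
        total += sum((r - xi) ** 2 for r, xi in zip(recon, x))
    return total / len(data)

def train(steps=500, lr=0.05):
    enc, dec = [0.1, 0.1], [0.1, 0.1]             # hypothetical starting weights
    for _ in range(steps):
        grad_enc, grad_dec = [0.0, 0.0], [0.0, 0.0]
        for x in data:
            code, recon = reconstruct(x, enc, dec)
            err = [r - xi for r, xi in zip(recon, x)]
            # Gradient of the squared error with respect to the code value.
            d_code = 2.0 * (err[0] * dec[0] + err[1] * dec[1])
            for i in range(2):
                grad_dec[i] += 2.0 * err[i] * code / len(data)
                grad_enc[i] += d_code * x[i] / len(data)
        enc = [w - lr * g for w, g in zip(enc, grad_enc)]
        dec = [w - lr * g for w, g in zip(dec, grad_dec)]
    return enc, dec

initial_loss = mean_squared_error([0.1, 0.1], [0.1, 0.1])
encoder, decoder = train()
final_loss = mean_squared_error(encoder, decoder)
print(initial_loss, final_loss)   # reconstruction error before and after training
```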

  11. Deep Learning Methods: Supervised Methods
 • Deep Neural Networks
 • Recurrent Neural Networks
 • Convolutional Neural Networks
 [Figure: Vision (Deep CNN) feeding Language (Generating RNN), with the captions “A group of people shopping at an outdoor market.” and “There are many vegetables at the fruit stand.”]

  12. How to train a deep network? Stochastic Gradient Descent (supervised learning):
 • show the input vectors of a few examples
 • compute the outputs and the errors
 • compute the average gradient
 • update the weights accordingly
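
 The four steps can be sketched on a deliberately tiny problem. The model here is a single linear unit rather than a deep network, and the target function, batch size and learning rate are all hypothetical choices made so the loop stays readable; the structure of the loop is the point.

```python
import random

random.seed(0)
# Toy supervised data from a hypothetical target y = 2x + 1.
examples = [(k / 10.0, 2.0 * (k / 10.0) + 1.0) for k in range(-10, 11)]

weight, bias = 0.0, 0.0
learning_rate = 0.1
batch_size = 4

for step in range(2000):
    batch = random.sample(examples, batch_size)     # show a few examples
    grad_w = grad_b = 0.0
    for x, target in batch:
        error = (weight * x + bias) - target        # compute the output and the error
        grad_w += 2.0 * error * x / batch_size      # average gradient of the squared error
        grad_b += 2.0 * error / batch_size
    weight -= learning_rate * grad_w                # update the weights accordingly
    bias -= learning_rate * grad_b

print(weight, bias)   # should end up close to 2 and 1
```

 In a deep network the two gradient lines are replaced by backpropagation through all layers, but the sample, evaluate, average, update cycle is the same.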

  13. Convolutional Neural Networks
 • CNNs are designed to process data in the form of multiple arrays (e.g. 2D images, 3D video/volumetric images)
 • A typical architecture is composed of a series of stages: convolutional layers and pooling layers
 • Each unit is connected to local patches in the feature maps of the previous layer
 [Figure: example network alternating convolution and pooling stages]
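
 One convolution stage followed by one pooling stage can be sketched in plain Python. The 5 x 5 image and the 2 x 2 edge filter below are hypothetical, and the code assumes a single channel, stride 1 and no padding; real networks learn many filters per layer and usually apply a non-linearity between the two stages.

```python
def conv2d_valid(image, kernel):
    """Slide the kernel over the image; each output sees only a local patch."""
    kh, kw = len(kernel), len(kernel[0])
    out_h = len(image) - kh + 1
    out_w = len(image[0]) - kw + 1
    return [[sum(image[r + i][c + j] * kernel[i][j]
                 for i in range(kh) for j in range(kw))
             for c in range(out_w)]
            for r in range(out_h)]

def max_pool(feature_map, size=2):
    """Keep the maximum of each non-overlapping size x size block."""
    rows = len(feature_map) // size
    cols = len(feature_map[0]) // size
    return [[max(feature_map[r * size + i][c * size + j]
                 for i in range(size) for j in range(size))
             for c in range(cols)]
            for r in range(rows)]

# Hypothetical input: a 5 x 5 image with a vertical edge between columns 1 and 2.
image = [[0, 0, 1, 1, 1] for _ in range(5)]
kernel = [[-1, 1], [-1, 1]]                 # responds to dark-to-bright vertical edges

feature_map = conv2d_valid(image, kernel)   # 4 x 4, non-zero only at the edge
pooled = max_pool(feature_map)              # 2 x 2

print(feature_map)   # [[0, 2, 0, 0], [0, 2, 0, 0], [0, 2, 0, 0], [0, 2, 0, 0]]
print(pooled)        # [[2, 0], [2, 0]]
```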

  14. Key Idea behind Convolutional Networks: convolutional networks take advantage of the properties of natural signals:
 • local connections

  15. Key Idea behind Convolutional Networks: convolutional networks take advantage of the properties of natural signals:
 • local connections
 • shared weights

  16. Key Idea behind Convolutional Networks: convolutional networks take advantage of the properties of natural signals:
 • local connections
 • shared weights
 • pooling

  17. Key Idea behind Convolutional Networks: convolutional networks take advantage of the properties of natural signals:
 • local connections
 • shared weights
 • pooling
 • the use of many layers
 [Figure: network output “Person”]
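
 A quick parameter count shows why local connections and shared weights matter. The sizes below (a 32 x 32 single-channel input and one 5 x 5 filter, stride 1, no padding) are hypothetical and chosen only to make the arithmetic easy to follow.

```python
def dense_params(in_h, in_w, out_h, out_w):
    # Every output unit has its own weight for every input pixel, plus a bias.
    return (in_h * in_w + 1) * (out_h * out_w)

def local_unshared_params(out_h, out_w, k):
    # Each output unit sees only a k x k patch but keeps its own weights and bias.
    return (k * k + 1) * (out_h * out_w)

def conv_params(k):
    # One k x k filter and one bias, shared by every output position.
    return k * k + 1

in_size, k = 32, 5
out_size = in_size - k + 1   # 28 outputs per side

counts = {
    "fully connected": dense_params(in_size, in_size, out_size, out_size),
    "local, not shared": local_unshared_params(out_size, out_size, k),
    "local and shared (convolution)": conv_params(k),
}
for name, count in counts.items():
    print(f"{name}: {count}")
# fully connected: 803600
# local, not shared: 20384
# local and shared (convolution): 26
```

 Pooling and depth do not change these counts directly; pooling shrinks the feature maps passed to the next layer, and stacking layers lets small local filters cover progressively larger regions of the input.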

  18. Pros & Cons
 Pros:
 • Best performing method in many Computer Vision tasks
 • No need of hand-crafted features
 • Most applicable method for large-scale problems, e.g. classification of 1000 classes
 • Easy parallelization on GPUs
 Cons:
 • Need of huge amount of training data
 • Hard to train (local minima problem, tuning hyper-parameters)
 • Difficult to analyse (to be solved)

  19. Deep Learning Applications in Computer Vision

  20. Handwritten Digit Recognition

  21. ImageNet Classification with Deep Convolutional Neural Networks (AlexNet)

  22. FlowNet: Learning Optical Flow with Convolutional Networks
 In collaboration with University of Freiburg, lmb.informatik.uni-freiburg.de

  23. FlowNet: Learning Optical Flow with Convolutional Networks

  24. FlowNet: Learning Optical Flow with Convolutional Networks
 [Figure: the FlowNetSimple and FlowNetCorr architectures. Both use the layers conv1, conv2, conv3, conv3_1, conv4, conv4_1, conv5, conv5_1 and conv6, followed by refinement and prediction; FlowNetCorr additionally contains a correlation layer (corr, 441) and a 1 x 1 conv_redir layer.]

  25. FlowNet: Learning Optical Flow with Convolutional Networks
 [Figure: detail of FlowNetCorr showing the correlation layer (corr, 441) and the 1 x 1 conv_redir layer]
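
 A correlation layer of this kind compares each feature vector of the first image with feature vectors of the second image at a range of displacements, producing one output channel per displacement. The sketch below is a simplified pure-Python illustration, not the authors' implementation. The maximum displacement of 20 with stride 2 used in the final check is taken from my reading of the FlowNet paper rather than from the slide, so treat it as an assumption; it does reproduce the 441 shown in the figure, since 21 x 21 = 441.

```python
def correlation(f1, f2, max_disp, stride):
    """Compare each feature vector in f1 with shifted feature vectors in f2.

    f1, f2: feature maps as nested lists indexed [row][col][channel].
    Returns a map with one output channel per displacement (dy, dx).
    """
    height, width = len(f1), len(f1[0])
    shifts = range(-max_disp, max_disp + 1, stride)
    out = []
    for y in range(height):
        row = []
        for x in range(width):
            scores = []
            for dy in shifts:
                for dx in shifts:
                    y2, x2 = y + dy, x + dx
                    if 0 <= y2 < height and 0 <= x2 < width:
                        # Dot product of the two feature vectors.
                        scores.append(sum(a * b for a, b in zip(f1[y][x], f2[y2][x2])))
                    else:
                        scores.append(0.0)   # positions outside the map count as zeros
            row.append(scores)
        out.append(row)
    return out

def empty_map(height, width, channels):
    return [[[0.0] * channels for _ in range(width)] for _ in range(height)]

# Hypothetical 3 x 3 maps with 2 channels: one feature moves one pixel to the right.
first, second = empty_map(3, 3, 2), empty_map(3, 3, 2)
first[1][1] = [1.0, 2.0]
second[1][2] = [1.0, 2.0]

corr = correlation(first, second, max_disp=1, stride=1)   # 3 x 3 = 9 displacements
best = corr[1][1].index(max(corr[1][1]))
print(best)   # 5, i.e. the displacement (dy=0, dx=+1)

# Displacements per axis for max_disp=20, stride=2 (assumed settings): 21, so 21 * 21 = 441.
channels = len(range(-20, 21, 2)) ** 2
print(channels)   # 441
```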

  26. FlowNet: Learning Optical Flow with Convolutional Networks

  27. From Image to Caption: Vision (Deep CNN) feeding Language (Generating RNN). Example captions:
 • A group of people shopping at an outdoor market.
 • There are many vegetables at the fruit stand.
 • A woman is throwing a frisbee in a park.
 • A dog is standing on a hardwood floor.
 • A stop sign is on a road with a mountain in the background.
 • A little girl sitting on a bed with a teddy bear.
 • A group of people sitting on a boat in the water.
 • A giraffe standing in a forest with trees in the background.

  28. Deep Learning in Computer Vision
 Caner Hazırbaş | hazirbas@cs.tum.edu
 End of Presentation. Questions?

  29. References
 • Quoc V. Le, Rajat Monga, Matthieu Devin, Kai Chen, Greg S. Corrado, Jeff Dean, Andrew Y. Ng. Building High-level Features Using Large Scale Unsupervised Learning. ICML’12.
 • Honglak Lee, Roger Grosse, Rajesh Ranganath, Andrew Y. Ng. Convolutional Deep Belief Networks for Scalable Unsupervised Learning of Hierarchical Representations. ICML’09.
 • Alex Krizhevsky, Ilya Sutskever, Geoffrey E. Hinton. ImageNet Classification with Deep Convolutional Neural Networks. NIPS’12.
 • Y. LeCun, L. Bottou, Y. Bengio, P. Haffner. Gradient-based learning applied to document recognition. Proceedings of the IEEE’98.
 • Philipp Fischer, Alexey Dosovitskiy, Eddy Ilg, Philip Häusser, Caner Hazırbaş, Vladimir Golkov, Patrick van der Smagt, Daniel Cremers, Thomas Brox. FlowNet: Learning Optical Flow with Convolutional Networks.

  30. References
 • Google’s cat detection neural network: http://www.resnap.com/image-selection-technology/deep-learning-image-classification/
 • Example auto-encoder: http://nghiaho.com/?p=1765
 • SGD: http://blog.datumbox.com/tuning-the-learning-rate-in-gradient-descent/
