Convolutional Neural Networks for Computer Vision
Caner Hazırbaş
Centrum für Informations- und Sprachverarbeitung, 24 November 2015

Computer Vision Group
5 postdocs, 24 PhD students
Caner Hazırbaş | vision.in.tum.de

Research in Computer Vision

Convolutional Neural Networks for Computer Vision

What is deep learning?
Representation learning method
• Learning good features automatically from raw data
• Learning representations of data with multiple levels of abstraction
• Google’s cat detection neural network

Going deeper in the network
• Input: ‘Pixels’
• 1st and 2nd layers: ‘Edges’
• 3rd layer: ‘Object Parts’
• 4th layer: ‘Objects’
[Figure: example learned features for faces, cars, airplanes and motorbikes]

Deep Learning Methods
Unsupervised Methods
• Restricted Boltzmann Machines
• Deep Belief Networks
• Autoencoders: unsupervised feature extraction/learning (encode, decode)

Deep Learning Methods
Supervised Methods
• Deep Neural Networks
• Recurrent Neural Networks
• Convolutional Neural Networks
[Figure: a deep CNN (vision) feeding a generating RNN (language). Example captions: “A group of people shopping at an outdoor market.” “There are many vegetables at the fruit stand.”]

How to train a deep network?
Stochastic Gradient Descent (supervised learning)
• show the input vectors of a few examples
• compute the outputs and the errors
• compute the average gradient
• update the weights accordingly
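The four steps above can be sketched in a few lines. This is a minimal illustration, not the training code used in the talk: the linear model, the squared-error loss, the toy data and the names `sgd_step` and `train` are all assumptions made for the example.

```python
import random

def sgd_step(weights, batch, lr):
    """One SGD step for a linear model y = w . x with squared error (assumed model)."""
    grad = [0.0] * len(weights)
    for x, y in batch:
        # compute the output and the error for this example
        pred = sum(w * xi for w, xi in zip(weights, x))
        err = pred - y
        # accumulate the gradient of 0.5 * err^2 with respect to each weight
        for i, xi in enumerate(x):
            grad[i] += err * xi
    # average the gradient over the mini-batch, then update the weights
    return [w - lr * g / len(batch) for w, g in zip(weights, grad)]

def train(data, n_features, lr=0.1, batch_size=4, steps=500, seed=0):
    rng = random.Random(seed)
    weights = [0.0] * n_features
    for _ in range(steps):
        batch = rng.sample(data, batch_size)  # show a few examples
        weights = sgd_step(weights, batch, lr)
    return weights

# Toy data generated from y = 2*x0 - 3*x1 (hypothetical target, noise-free)
data_rng = random.Random(1)
data = []
for _ in range(64):
    x = [data_rng.uniform(-1, 1), data_rng.uniform(-1, 1)]
    data.append((x, 2.0 * x[0] - 3.0 * x[1]))

learned = train(data, n_features=2)
```

With noise-free data the learned weights should approach the generating coefficients; a deep network replaces the linear model and computes the gradient by backpropagation, but the loop has the same shape.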

How to train a deep network?
Alternatives:
• AdaGrad, AdaDelta, NAG (Nesterov’s Accelerated Gradient)…
• Adam (now in Caffe: http://caffe.berkeleyvision.org/tutorial/solver.html)
Adam is a gradient-based optimization method (like SGD). It includes an “adaptive moment estimation” ($m_t$, $v_t$) and can be regarded as a generalization of AdaGrad. The update formulas are:
$$(m_t)_i = \beta_1 (m_{t-1})_i + (1 - \beta_1)(\nabla L(W_t))_i,$$
$$(v_t)_i = \beta_2 (v_{t-1})_i + (1 - \beta_2)(\nabla L(W_t))_i^2,$$
$$(W_{t+1})_i = (W_t)_i - \alpha \frac{\sqrt{1 - (\beta_2)^t}}{1 - (\beta_1)^t} \frac{(m_t)_i}{\sqrt{(v_t)_i} + \varepsilon}.$$
D. Kingma, J. Ba. Adam: A Method for Stochastic Optimization. International Conference on Learning Representations, 2015
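The Adam update quoted on this slide, with the bias correction folded into the step size, can be sketched as follows. The function name `adam_step`, the default hyperparameters and the toy quadratic loss are assumptions chosen for illustration; this is not Caffe's solver code.

```python
import math

def adam_step(w, grad, m, v, t, alpha=0.001, beta1=0.9, beta2=0.999, eps=1e-8):
    """One Adam update for step t (t starts at 1); all vectors are plain lists."""
    # first and second moment estimates, element-wise
    m = [beta1 * mi + (1 - beta1) * gi for mi, gi in zip(m, grad)]
    v = [beta2 * vi + (1 - beta2) * gi * gi for vi, gi in zip(v, grad)]
    # bias correction expressed as a rescaled step size
    scale = alpha * math.sqrt(1 - beta2 ** t) / (1 - beta1 ** t)
    w = [wi - scale * mi / (math.sqrt(vi) + eps) for wi, mi, vi in zip(w, m, v)]
    return w, m, v

# Toy problem (assumed): minimise L(w) = (w0 - 1)^2 + (w1 + 2)^2
w, m, v = [0.0, 0.0], [0.0, 0.0], [0.0, 0.0]
for t in range(1, 2001):
    grad = [2 * (w[0] - 1.0), 2 * (w[1] + 2.0)]
    w, m, v = adam_step(w, grad, m, v, t, alpha=0.01)
```

On the very first step the moment estimates cancel, so each weight moves by roughly `alpha` in the direction opposite to its gradient sign, regardless of the gradient's magnitude.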

Convolutional Neural Networks
• CNNs are designed to process data in the form of multiple arrays (e.g. 2D images, 3D video/volumetric images)
• A typical architecture is composed of a series of stages: convolutional layers and pooling layers
• Each unit is connected to local patches in the feature maps of the previous layer
[Figure: example CNN with alternating convolution and pooling stages (conv1, pool1, ..., pool2)]
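A single convolution stage followed by a pooling stage can be sketched in plain Python. The helper names `conv2d` and `max_pool2d`, the 6 x 6 test image and the edge kernel are assumptions for the example; real frameworks add channels, strides, padding and biases.

```python
def conv2d(image, kernel):
    """Valid 2D cross-correlation: each output unit sees one local patch,
    and every patch is weighted by the same kernel."""
    kh, kw = len(kernel), len(kernel[0])
    out_h = len(image) - kh + 1
    out_w = len(image[0]) - kw + 1
    return [[sum(image[r + i][c + j] * kernel[i][j]
                 for i in range(kh) for j in range(kw))
             for c in range(out_w)]
            for r in range(out_h)]

def max_pool2d(fmap, size=2):
    """Non-overlapping max pooling; rows/columns that do not fill a window are dropped."""
    out_h, out_w = len(fmap) // size, len(fmap[0]) // size
    return [[max(fmap[r * size + i][c * size + j]
                 for i in range(size) for j in range(size))
             for c in range(out_w)]
            for r in range(out_h)]

# 6 x 6 image with a vertical edge between columns 2 and 3
image = [[0, 0, 0, 1, 1, 1] for _ in range(6)]
edge_kernel = [[-1, 1], [-1, 1]]       # responds to left-to-right intensity increases
features = conv2d(image, edge_kernel)  # 5 x 5 feature map, non-zero only at the edge
pooled = max_pool2d(features)          # 2 x 2 map after 2 x 2 pooling
```

The feature map is non-zero only in the column where the edge sits, and pooling keeps that response while shrinking the map.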

Key Idea behind Convolutional Networks
Convolutional networks take advantage of the properties of natural signals:
• local connections
• shared weights
• pooling
• the use of many layers
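The effect of local connections and shared weights on model size can be seen with a small parameter count. The layer sizes below (a 32 x 32 single-channel input mapped to a 28 x 28 output, 5 x 5 kernel) are hypothetical numbers chosen for the arithmetic, not taken from the slides.

```python
def fully_connected_params(in_h, in_w, out_h, out_w):
    """Every output unit has its own weight for every input pixel, plus a bias."""
    n_in, n_out = in_h * in_w, out_h * out_w
    return n_in * n_out + n_out

def conv_params(kernel_size, in_channels, out_channels):
    """One kernel per (input channel, output channel) pair, shared across all
    positions, plus one bias per output channel."""
    return kernel_size * kernel_size * in_channels * out_channels + out_channels

fc = fully_connected_params(32, 32, 28, 28)  # dense layer with the same input/output sizes
conv = conv_params(5, 1, 1)                  # 5 x 5 valid convolution, one feature map
```

Here the dense layer needs 803,600 parameters while the convolution needs 26, because the same 5 x 5 kernel is reused at every position.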

FlowNet: Learning Optical Flow with Convolutional Networks
Philipp Fischer, Alexey Dosovitskiy, Eddy Ilg, Thomas Brox, Philip Häusser, Caner Hazırbaş, Vladimir Golkov, Daniel Cremers, Patrick van der Smagt

Flying Chairs

Flying Chairs
[Figure: histograms of the number of pixels per displacement (px, 0 to 100) for Flying Chairs and Sintel, shown on linear and log scales]

Data Augmentation
• translation, rotation, scaling, additive Gaussian noise
• changes in brightness, contrast, gamma and colour
[Figure: generated vs. augmented examples]
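A few of the listed augmentations can be sketched on a grayscale image stored as rows of floats in [0, 1]. The function names, the parameter ranges and the noise level are assumptions for illustration and are not the values used for FlowNet. For optical flow, a geometric transform would also have to be applied consistently to both images of a pair and to the ground-truth flow, which this sketch omits.

```python
import random

def clip01(value):
    return min(max(value, 0.0), 1.0)

def photometric_augment(image, rng):
    """Random brightness, contrast and gamma change plus additive Gaussian noise."""
    brightness = rng.uniform(-0.2, 0.2)  # assumed range
    contrast = rng.uniform(0.8, 1.2)     # assumed range
    gamma = rng.uniform(0.7, 1.5)        # assumed range
    sigma = 0.02                         # assumed noise standard deviation
    out = []
    for row in image:
        new_row = []
        for p in row:
            q = clip01((p - 0.5) * contrast + 0.5 + brightness)
            q = q ** gamma
            q += rng.gauss(0.0, sigma)
            new_row.append(clip01(q))
        out.append(new_row)
    return out

def translate(image, dx, dy, fill=0.0):
    """Shift the image by (dx, dy) pixels; uncovered pixels get the fill value."""
    h, w = len(image), len(image[0])
    return [[image[r - dy][c - dx] if 0 <= r - dy < h and 0 <= c - dx < w else fill
             for c in range(w)]
            for r in range(h)]

rng = random.Random(0)
image = [[(r + c) / 6.0 for c in range(4)] for r in range(4)]
augmented = photometric_augment(translate(image, dx=1, dy=0), rng)
```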

FlowNetSimple
[Figure: FlowNetSimple architecture. Layers: conv1, conv2, conv3, conv3_1, conv4, conv4_1, conv5, conv5_1, conv6, then refinement and prediction. Recoverable labels: kernels 7 x 7, 5 x 5, 5 x 5, 3 x 3; resolutions 384 x 512, 192 x 256, 96 x 128, 136 x 320; channel counts 6, 64, 128, 256, 512, 1024]

FlowNetSimple - Flying Chairs
[Figure: FlowNetSimple architecture diagram]

FlowNetSimple - Sintel
[Figure: FlowNetSimple architecture diagram]

FlowNetCorr
[Figure: FlowNetCorr architecture. Layers: conv1, conv2, conv3, corr, conv_redir, conv3_1, conv4, conv4_1, conv5, conv5_1, conv6, then refinement and prediction. Recoverable labels: kernels 7 x 7, 5 x 5, 1 x 1, 3 x 3; resolutions 384 x 512, 136 x 320; channel counts 32, 64, 128, 256, 441 (corr), 473, 512, 1024]

Correlation Layer
[Figure: correlation stage of FlowNetCorr. Recoverable labels: conv_redir, corr, sqrt, kernels 1 x 1 and 3 x 3, channel counts 256 and 441]
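One way to read the correlation layer is as a dot product between the feature vector at each position of the first map and the feature vectors at displaced positions of the second map, with one output channel per displacement. The sketch below makes several assumptions: single-pixel patches, zero scores outside the image, and the function name `correlation`. The 441 channels on the slide are consistent with 21 x 21 displacements, for example a maximum displacement of 20 sampled with stride 2, as described in the FlowNet paper; the slide itself does not state these values.

```python
def correlation(f1, f2, max_disp, stride=1):
    """f1, f2: feature maps as nested lists [H][W][C].
    Returns [H][W][D*D] with D = number of displacements per axis."""
    h, w = len(f1), len(f1[0])
    offsets = list(range(-max_disp, max_disp + 1, stride))
    out = []
    for y in range(h):
        out_row = []
        for x in range(w):
            scores = []
            for dy in offsets:
                for dx in offsets:
                    y2, x2 = y + dy, x + dx
                    if 0 <= y2 < h and 0 <= x2 < w:
                        # dot product of the two feature vectors
                        scores.append(sum(a * b for a, b in zip(f1[y][x], f2[y2][x2])))
                    else:
                        scores.append(0.0)  # assumed: zero outside the image
            out_row.append(scores)
        out.append(out_row)
    return out

# A single active feature that moves one pixel to the right between the two maps
f1 = [[[1.0], [0.0]], [[0.0], [0.0]]]
f2 = [[[0.0], [1.0]], [[0.0], [0.0]]]
corr = correlation(f1, f2, max_disp=1)
best = corr[0][0].index(max(corr[0][0]))  # channel of the strongest match at (0, 0)
n_channels = len(correlation(f1, f2, max_disp=20, stride=2)[0][0])
```

With displacements ordered row by row from (-1, -1) to (1, 1), the strongest channel at position (0, 0) is the one for (dy, dx) = (0, 1), matching the one-pixel shift to the right.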

FlowNetCorr - Flying Chairs
[Figure: FlowNetCorr architecture diagram]

Simple vs. Corr - Flying Chairs
[Figure: FlowNetCorr architecture diagram; panels labelled FlowNetS and FlowNetCorr]

FlowNetCorr - Sintel
[Figure: FlowNetCorr architecture diagram]

Simple vs. Corr - Sintel
[Figure: FlowNetCorr architecture diagram; panels labelled FlowNetS and FlowNetCorr]

FlowNetSimple + Variational Smoothing

FlowNet: Learning Optical Flow with Convolutional Networks

References
• Building High-level Features Using Large Scale Unsupervised Learning. Quoc V. Le, Rajat Monga, Matthieu Devin, Kai Chen, Greg S. Corrado, Jeff Dean, Andrew Y. Ng. ICML’12
• Convolutional Deep Belief Networks for Scalable Unsupervised Learning of Hierarchical Representations. Honglak Lee, Roger Grosse, Rajesh Ranganath, Andrew Y. Ng. ICML’09
• ImageNet Classification with Deep Convolutional Neural Networks. Alex Krizhevsky, Ilya Sutskever, Geoffrey E. Hinton. NIPS’12
• Gradient-based learning applied to document recognition. Y. LeCun, L. Bottou, Y. Bengio, and P. Haffner. Proceedings of the IEEE’98
• FlowNet: Learning Optical Flow with Convolutional Networks. Philipp Fischer, Alexey Dosovitskiy, Eddy Ilg, Philip Häusser, Caner Hazırbaş, Vladimir Golkov, Patrick van der Smagt, Daniel Cremers, Thomas Brox
