

  1. BBM413 Fundamentals of Image Processing
 Introduction to Deep Learning
 Erkut Erdem
 Hacettepe University Computer Vision Lab (HUCVL)

  2. What is deep learning?
 “Deep learning allows computational models that are composed of multiple processing layers to learn representations of data with multiple levels of abstraction.”
 − Yann LeCun, Yoshua Bengio and Geoff Hinton
 Y. LeCun, Y. Bengio, G. Hinton, “Deep Learning”, Nature, Vol. 521, 28 May 2015

  3. 1943 – 2006: A Prehistory of Deep Learning

  4. 1943: Warren McCulloch and Walter Pitts
 • First computational model of the neuron
 • Neurons as logic gates (AND, OR, NOT)
 • A neuron model that sums binary inputs and outputs 1 if the sum exceeds a certain threshold value, and outputs 0 otherwise
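
A minimal Python sketch of such a threshold unit, assuming a "fire when the summed input reaches the threshold" convention; the specific thresholds chosen for AND/OR and the inhibitory encoding of NOT are illustrative, not taken from the slide:

```python
def mcculloch_pitts(inputs, threshold):
    """Sum binary inputs; fire (output 1) iff the sum reaches the threshold."""
    return 1 if sum(inputs) >= threshold else 0

# Logic gates as threshold units (thresholds chosen for two binary inputs).
AND = lambda a, b: mcculloch_pitts([a, b], threshold=2)
OR  = lambda a, b: mcculloch_pitts([a, b], threshold=1)
# NOT modelled with an inhibitory input: fire only when the input is absent.
NOT = lambda a: mcculloch_pitts([1 - a], threshold=1)

assert [AND(a, b) for a, b in [(0, 0), (0, 1), (1, 0), (1, 1)]] == [0, 0, 0, 1]
assert [OR(a, b)  for a, b in [(0, 0), (0, 1), (1, 0), (1, 1)]] == [0, 1, 1, 1]
assert [NOT(a) for a in (0, 1)] == [1, 0]
```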

  5. 1958: Frank Rosenblatt’s Perceptron
 • A computational model of a single neuron
 • Solves a binary classification problem
 • Simple training algorithm
 • Built using specialized hardware
 F. Rosenblatt, “The perceptron: A probabilistic model for information storage and organization in the brain”, Psychological Review, Vol. 65, 1958
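
The "simple training algorithm" is the error-driven perceptron update rule; a hedged numpy sketch on toy, linearly separable data (the data, learning rate, and epoch count are placeholders):

```python
import numpy as np

def perceptron_train(X, y, epochs=20, lr=1.0):
    """Rosenblatt-style perceptron: a linear threshold unit trained with the
    classic error-driven update. Labels y are assumed to be 0/1."""
    w = np.zeros(X.shape[1])
    b = 0.0
    for _ in range(epochs):
        for xi, yi in zip(X, y):
            pred = 1 if xi @ w + b > 0 else 0
            w += lr * (yi - pred) * xi   # no change when the prediction is correct
            b += lr * (yi - pred)
    return w, b

# Toy linearly separable problem (logical AND), purely illustrative.
X = np.array([[0, 0], [0, 1], [1, 0], [1, 1]], dtype=float)
y = np.array([0, 0, 0, 1])
w, b = perceptron_train(X, y)
print([1 if xi @ w + b > 0 else 0 for xi in X])   # -> [0, 0, 0, 1]
```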

  6. 1969: Marvin Minsky and Seymour Papert
 “No machine can learn to recognize X unless it possesses, at least potentially, some scheme for representing X.” (p. xiii)
 • Perceptrons can only represent linearly separable functions, so they cannot solve problems such as XOR
 • Wrongly attributed as the reason behind the AI winter, a period of reduced funding and interest in AI research

  7. 1990s
 • Multi-layer perceptrons can theoretically approximate any function (Cybenko, 1989; Hornik, 1991)
 • Training multi-layer perceptrons
 - Back-propagation (Rumelhart, Hinton, Williams, 1986)
 - Back-propagation through time (BPTT) (Werbos, 1988)
 • New neural architectures
 - Convolutional neural nets (LeCun et al., 1989)
 - Long short-term memory networks (LSTM) (Hochreiter and Schmidhuber, 1997)

  8. Why it failed then
 • Too many parameters to learn from few labeled examples.
 • “I know my features are better for this task”.
 • Non-convex optimization? No, thanks.
 • Black-box model, no interpretability.
 • Very slow and inefficient.
 • Overshadowed by the success of SVMs (Cortes and Vapnik, 1995).
 Adapted from Joan Bruna

  9. A major breakthrough in 2006

  10. 2006 Breakthrough: Hinton and Salakhutdinov
 • The first solution to the vanishing gradient problem
 • Build the model in a layer-by-layer fashion using unsupervised learning
 - The features in early layers are already initialized or “pretrained” with some suitable features (weights)
 - Pretrained features in early layers only need to be adjusted slightly during supervised learning to achieve good results
 G. E. Hinton and R. R. Salakhutdinov, “Reducing the dimensionality of data with neural networks”, Science, Vol. 313, 28 July 2006
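
As a rough illustration of the layer-by-layer idea, here is a hedged PyTorch sketch that greedily pretrains each layer as a small autoencoder and then stacks the encoders under a supervised head. This is not the RBM-based procedure of the paper; all layer sizes, learning rates, and data are placeholders:

```python
import torch
import torch.nn as nn

sizes = [784, 256, 64]                    # illustrative layer widths
x = torch.randn(128, sizes[0])            # stand-in for unlabeled training data
encoders = []

for d_in, d_out in zip(sizes[:-1], sizes[1:]):
    enc, dec = nn.Linear(d_in, d_out), nn.Linear(d_out, d_in)
    opt = torch.optim.SGD(list(enc.parameters()) + list(dec.parameters()), lr=0.1)
    for _ in range(100):                  # unsupervised reconstruction of this layer's input
        opt.zero_grad()
        loss = ((dec(torch.sigmoid(enc(x))) - x) ** 2).mean()
        loss.backward()
        opt.step()
    encoders.append(enc)
    x = torch.sigmoid(enc(x)).detach()    # becomes the input of the next layer

# Stack the pretrained encoders; the whole model is then fine-tuned with labels.
layers = []
for enc in encoders:
    layers += [enc, nn.Sigmoid()]
layers.append(nn.Linear(sizes[-1], 10))   # supervised classification head
model = nn.Sequential(*layers)
```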

  11. The 2012 revolution

  12. ImageNet Challenge
 • Large Scale Visual Recognition Challenge (ILSVRC)
 - 1.2M training images with 1K categories
 - Measure top-5 classification error
 (figure: image classification examples showing model outputs for easiest and hardest classes, e.g. scale, T-shirt, steel drum, giant panda, drumstick, mud turtle)
 J. Deng, W. Dong, R. Socher, L.-J. Li, K. Li and L. Fei-Fei, “ImageNet: A Large-Scale Hierarchical Image Database”, CVPR 2009
 O. Russakovsky et al., “ImageNet Large Scale Visual Recognition Challenge”, Int. J. Comput. Vis., Vol. 115, Issue 3, pp. 211-252, 2015
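
Top-5 error counts a prediction as wrong only if the ground-truth label is not among the model's five highest-scoring classes. A small numpy sketch with random scores (purely illustrative):

```python
import numpy as np

def top5_error(scores, labels):
    """scores: (N, C) class scores; labels: (N,) ground-truth class indices."""
    top5 = np.argsort(-scores, axis=1)[:, :5]       # five highest-scoring classes per image
    hit = (top5 == labels[:, None]).any(axis=1)     # is the true label among them?
    return 1.0 - hit.mean()

rng = np.random.default_rng(0)
scores = rng.normal(size=(1000, 1000))              # 1000 images, 1000 classes (illustrative)
labels = rng.integers(0, 1000, size=1000)
print(top5_error(scores, labels))                   # ~0.995 for random guessing
```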

  13. ILSVRC 2012 Competition
 Team | Top-5 error (%)
 SuperVision (Toronto), CNN based | 15.3
 ISI (Tokyo) | 26.1
 VGG (Oxford) | 26.9
 XRCE/INRIA | 27.0
 UvA (Amsterdam) | 29.6
 INRIA/LEAR | 33.4
 (all entries other than SuperVision were non-CNN based)
 • The success of AlexNet, a deep convolutional network
 - 7 hidden layers (not counting some max pooling layers)
 - 60M parameters
 • Combined several tricks: ReLU activation function, data augmentation, dropout
 A. Krizhevsky, I. Sutskever, G. E. Hinton, “ImageNet Classification with Deep Convolutional Neural Networks”, NIPS 2012

  14. 2012 – now: A Cambrian explosion in deep learning

  15. Speech recognition: D. Amodei et al., "Deep Speech 2: End-to-End Speech Recognition in English and Mandarin", In CoRR 2015
 Machine Translation: M.-T. Luong et al., "Effective Approaches to Attention-based Neural Machine Translation", EMNLP 2015
 Self-Driving Cars: M. Bojarski et al., “End to End Learning for Self-Driving Cars”, In CoRR 2016
 Game Playing: D. Silver et al., "Mastering the game of Go with deep neural networks and tree search", Nature 529, 2016
 Robotics: L. Pinto and A. Gupta, “Supersizing Self-supervision: Learning to Grasp from 50K Tries and 700 Robot Hours”, ICRA 2015
 Genomics: H. Y. Xiong et al., "The human splicing code reveals new insights into the genetic determinants of disease", Science 347, 2015
 Audio Generation: M. Ramona et al., "Capturing a Musician's Groove: Generation of Realistic Accompaniments from Single Song Recordings", In IJCAI 2015
 And many more…

  16. Why now?

  17. Slide credit: Neil Lawrence

  18. Datasets vs. Algorithms
 Year | Breakthrough in AI | Dataset (First Available) | Algorithm (First Proposed)
 1994 | Human-level spontaneous speech recognition | Spoken Wall Street Journal articles and other texts (1991) | Hidden Markov Model (1984)
 1997 | IBM Deep Blue defeated Garry Kasparov | 700,000 Grandmaster chess games, aka “The Extended Book” (1991) | Negascout planning algorithm (1983)
 2005 | Google’s Arabic- and Chinese-to-English translation | 1.8 trillion tokens from Google Web and News pages (collected in 2005) | Statistical machine translation algorithm (1988)
 2011 | IBM Watson became the world Jeopardy! champion | 8.6 million documents from Wikipedia, Wiktionary, and Project Gutenberg (updated in 2010) | Mixture-of-Experts (1991)
 2014 | Google’s GoogLeNet object classification at near-human performance | ImageNet corpus of 1.5 million labeled images and 1,000 object categories (2010) | Convolutional Neural Networks (1989)
 2015 | Google’s DeepMind achieved human parity in playing 29 Atari games by learning general control from video | Arcade Learning Environment dataset of over 50 Atari games (2013) | Q-learning (1992)
 Average No. of Years to Breakthrough: 3 years (datasets), 18 years (algorithms)
 Table credit: Quant Quanto

  19. Powerful Hardware
 • CPU vs. GPU
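
The hardware argument is that the dense matrix multiplications at the heart of neural networks parallelize extremely well on GPUs. A hedged PyTorch timing sketch (matrix size, repeat count, and the resulting speedup are entirely hardware-dependent):

```python
import time
import torch

def avg_matmul_time(device, n=4096, repeats=10):
    """Average wall-clock time of an n x n matrix multiply on the given device."""
    a = torch.randn(n, n, device=device)
    b = torch.randn(n, n, device=device)
    if device == "cuda":
        torch.cuda.synchronize()          # make sure setup work is finished
    start = time.time()
    for _ in range(repeats):
        _ = a @ b
    if device == "cuda":
        torch.cuda.synchronize()          # GPU kernels run asynchronously
    return (time.time() - start) / repeats

print("CPU:", avg_matmul_time("cpu"))
if torch.cuda.is_available():
    print("GPU:", avg_matmul_time("cuda"))
```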

  20. (figure-only slide)

  21. Working ideas on how to train deep architectures
 • Better learning regularization (e.g. Dropout)
 N. Srivastava, G. Hinton, A. Krizhevsky, I. Sutskever, R. Salakhutdinov, “Dropout: A Simple Way to Prevent Neural Networks from Overfitting”, JMLR, Vol. 15, No. 1, 2014
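
A numpy sketch of the commonly used "inverted" dropout formulation, in which surviving activations are rescaled at training time so the layer is the identity at test time (the paper describes the equivalent variant that rescales weights at test time); the drop probability is illustrative:

```python
import numpy as np

def dropout_forward(h, p_drop=0.5, train=True, seed=None):
    """Zero each unit with probability p_drop during training and rescale the
    survivors so the expected activation is unchanged; identity at test time."""
    if not train:
        return h
    rng = np.random.default_rng(seed)
    mask = (rng.random(h.shape) >= p_drop) / (1.0 - p_drop)
    return h * mask

h = np.ones((2, 8))
print(dropout_forward(h, seed=0))       # roughly half the units zeroed, the rest scaled to 2.0
print(dropout_forward(h, train=False))  # unchanged at test time
```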

  22. Working ideas on how to train deep architectures
 • Better optimization conditioning (e.g. Batch Normalization)
 S. Ioffe, C. Szegedy, “Batch Normalization: Accelerating Deep Network Training by Reducing Internal Covariate Shift”, In ICML 2015
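
A numpy sketch of the training-time batch norm forward pass; running averages for test-time statistics are omitted, and the batch size and feature count are placeholders:

```python
import numpy as np

def batchnorm_forward(x, gamma, beta, eps=1e-5):
    """Normalize each feature to zero mean / unit variance across the mini-batch,
    then apply the learned scale (gamma) and shift (beta)."""
    mu = x.mean(axis=0)
    var = x.var(axis=0)
    x_hat = (x - mu) / np.sqrt(var + eps)
    return gamma * x_hat + beta

x = np.random.randn(64, 100) * 5.0 + 3.0            # a badly-scaled mini-batch (illustrative)
out = batchnorm_forward(x, gamma=np.ones(100), beta=np.zeros(100))
print(out.mean(axis=0)[:3], out.std(axis=0)[:3])    # ~0 and ~1 per feature
```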

  23. Working ideas on how to train deep architectures
 • Better neural architectures (e.g. Residual Nets)
 K. He, X. Zhang, S. Ren, J. Sun, “Deep Residual Learning for Image Recognition”, In CVPR 2016
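
A hedged PyTorch sketch of a basic residual block in the spirit of He et al.: the stacked convolutions learn a residual F(x) that is added back onto the identity shortcut; channel counts and kernel sizes are illustrative:

```python
import torch
import torch.nn as nn

class ResidualBlock(nn.Module):
    """Basic residual block: output = ReLU(F(x) + x) with an identity shortcut."""
    def __init__(self, channels):
        super().__init__()
        self.conv1 = nn.Conv2d(channels, channels, kernel_size=3, padding=1, bias=False)
        self.bn1 = nn.BatchNorm2d(channels)
        self.conv2 = nn.Conv2d(channels, channels, kernel_size=3, padding=1, bias=False)
        self.bn2 = nn.BatchNorm2d(channels)
        self.relu = nn.ReLU(inplace=True)

    def forward(self, x):
        out = self.relu(self.bn1(self.conv1(x)))
        out = self.bn2(self.conv2(out))
        return self.relu(out + x)          # skip connection adds the input back

x = torch.randn(8, 64, 32, 32)
print(ResidualBlock(64)(x).shape)          # torch.Size([8, 64, 32, 32])
```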

  24. Let’s review neural networks

  25. The Perceptron
 (figure: a single neuron with inputs x_0 … x_n, weights w_0 … w_n, a bias term b, a summation unit Σ, and a non-linearity)

  26. Perceptron Forward Pass
 • Neuron pre-activation (or input activation):
 a(x) = b + Σ_i w_i x_i = b + wᵀx
 • Neuron output activation:
 h(x) = g(a(x)) = g(b + Σ_i w_i x_i)
 where
 - w are the weights (parameters)
 - b is the bias term
 - g(·) is called the activation function
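
A numpy sketch of this forward pass; the sigmoid choice for g and the numeric values are illustrative:

```python
import numpy as np

def neuron_forward(x, w, b, g=lambda a: 1.0 / (1.0 + np.exp(-a))):
    """Single-neuron forward pass: pre-activation a(x) = b + w·x, output h(x) = g(a(x))."""
    a = b + w @ x
    return g(a)

x = np.array([0.5, -1.0, 2.0])
w = np.array([0.1, 0.4, -0.3])
b = 0.2
print(neuron_forward(x, w, b))   # sigmoid(0.2 - 0.95) = sigmoid(-0.75) ≈ 0.32
```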

  27. Output Activation of the Neuron
 h(x) = g(a(x)) = g(b + Σ_i w_i x_i)
 • The output range is determined by g(·)
 • The bias b only changes the position of the threshold: it shifts the activation curve without changing its shape
 Image credit: Pascal Vincent

  28. Linear Activation Function
 h(x) = g(a(x)) = g(b + Σ_i w_i x_i), with g(a) = a
 • No nonlinear transformation
 • No input squashing

  29. Sigmoid Activation Function
 h(x) = g(a(x)) = g(b + Σ_i w_i x_i), with g(a) = sigm(a) = 1 / (1 + exp(−a))
 • Squashes the neuron’s output between 0 and 1
 • Always positive
 • Bounded
 • Strictly increasing

  30. Hyperbolic Tangent (tanh) Activation Function
 h(x) = g(a(x)) = g(b + Σ_i w_i x_i), with
 g(a) = tanh(a) = (exp(a) − exp(−a)) / (exp(a) + exp(−a)) = (exp(2a) − 1) / (exp(2a) + 1)
 • Squashes the neuron’s output between −1 and 1
 • Can be positive or negative
 • Bounded
 • Strictly increasing
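
A small numpy sketch comparing the three activation functions from the last three slides and checking the stated properties (the input grid is arbitrary):

```python
import numpy as np

a = np.linspace(-4, 4, 9)

linear  = a                                   # no squashing, unbounded
sigmoid = 1.0 / (1.0 + np.exp(-a))            # bounded in (0, 1), always positive
tanh    = np.tanh(a)                          # bounded in (-1, 1), can be negative

# tanh(a) also equals (exp(2a) - 1) / (exp(2a) + 1), as on the slide:
assert np.allclose(tanh, (np.exp(2 * a) - 1) / (np.exp(2 * a) + 1))

for name, h in [("linear", linear), ("sigmoid", sigmoid), ("tanh", tanh)]:
    print(f"{name:8s} min={h.min():+.2f} max={h.max():+.2f}")
```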
