CS7015 (Deep Learning) : Lecture 1 (Partial/Brief) History of Deep Learning


  1. CS7015 (Deep Learning) : Lecture 1 (Partial/Brief) History of Deep Learning. Mitesh M. Khapra, Department of Computer Science and Engineering, Indian Institute of Technology Madras.

  2. Acknowledgements: Most of this material is based on the article “Deep Learning in Neural Networks: An Overview” by J. Schmidhuber [1]. The errors, if any, are due to me and I apologize for them. Feel free to contact me if you think certain portions need to be corrected (please provide appropriate references).

  3. Chapter 1: Biological Neurons (Module 1.1)

  4. Reticular Theory (1871-1873): Joseph von Gerlach proposed that the nervous system is a single continuous network, as opposed to a network of many discrete cells!

  5. Staining Technique: Camillo Golgi discovered a chemical reaction that allowed him to examine nervous tissue in much greater detail than ever before. He was a proponent of Reticular theory.

  6. Neuron Doctrine (1888-1891): Santiago Ramón y Cajal used Golgi's technique to study the nervous system and proposed that it is actually made up of discrete individual cells forming a network (as opposed to a single continuous network).

  7. The Term Neuron: The term neuron was coined by Heinrich Wilhelm Gottfried von Waldeyer-Hartz around 1891. He further consolidated the Neuron Doctrine.

  8. Nobel Prize (1906): Golgi (reticular theory) and Cajal (neuron doctrine) were jointly awarded the 1906 Nobel Prize in Physiology or Medicine, an award that resulted in lasting conflict and controversy between the two scientists.

  9. The Final Word (1950s): In the 1950s, electron microscopy finally confirmed the neuron doctrine by unambiguously demonstrating that nerve cells are individual cells interconnected through synapses (a network of many individual neurons).

  10. Chapter 2: From Spring to Winter of AI (Module 2)

  11. McCulloch-Pitts Neuron (1943): McCulloch (neuroscientist) and Pitts (logician) proposed a highly simplified model of the neuron [2].
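
As a quick illustration (a Python sketch of my own, not taken from the lecture; the function name and interface are made up), the McCulloch-Pitts unit has binary inputs, a fixed threshold and no learned weights, and any active inhibitory input vetoes the output:

    def mp_neuron(inputs, threshold, inhibitory=()):
        # McCulloch-Pitts unit: binary inputs, fixed threshold, no learned weights.
        # Any active inhibitory input suppresses the output entirely.
        if any(inputs[i] for i in inhibitory):
            return 0
        excitatory_sum = sum(x for i, x in enumerate(inputs) if i not in inhibitory)
        return 1 if excitatory_sum >= threshold else 0

    # With suitable thresholds, MP neurons implement simple Boolean functions:
    assert mp_neuron([1, 1], threshold=2) == 1   # AND fires
    assert mp_neuron([1, 0], threshold=2) == 0   # AND does not fire
    assert mp_neuron([1, 0], threshold=1) == 1   # OR fires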

  12. Perceptron (1957-1958): “the perceptron may eventually be able to learn, make decisions, and translate languages” - Frank Rosenblatt

  13. Perceptron: “the embryo of an electronic computer that the Navy expects will be able to walk, talk, see, write, reproduce itself and be conscious of its existence.” - New York Times
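
Unlike the MP neuron, Rosenblatt's perceptron has learnable weights and an error-driven update rule. A minimal NumPy sketch of that rule (my own illustration, not from the slides; names are made up):

    import numpy as np

    def train_perceptron(X, y, epochs=10, lr=1.0):
        # Perceptron rule: predict sign(w.x + b); on every mistake, nudge the
        # weights toward the misclassified example. Labels y are +1 / -1.
        w, b = np.zeros(X.shape[1]), 0.0
        for _ in range(epochs):
            for xi, yi in zip(X, y):
                if yi * (xi @ w + b) <= 0:       # misclassified (or on the boundary)
                    w += lr * yi * xi
                    b += lr * yi
        return w, b

    # Example on a linearly separable (OR-like) dataset:
    X = np.array([[0, 0], [0, 1], [1, 0], [1, 1]], dtype=float)
    y = np.array([-1, 1, 1, 1])
    w, b = train_perceptron(X, y)
    print(np.sign(X @ w + b))                    # [-1.  1.  1.  1.]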

  14. First Generation Multilayer Perceptrons (1965-1968): Ivakhnenko et al. [3]

  15. Perceptron Limitations (1969): In their now famous book “Perceptrons”, Minsky and Papert outlined the limits of what perceptrons could do [4].

  16. AI Winter of Connectionism (1969-1986): The criticism almost led to the abandonment of connectionist AI.

  17. Backpropagation (1986): Discovered and rediscovered several times throughout the 1960s and 1970s. Werbos (1982) [5] first used it in the context of artificial neural networks. It was eventually popularized by the work of Rumelhart et al. in 1986 [6].
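
In modern notation (a standard derivation, not reproduced from the slide), backpropagation is the chain rule applied layer by layer, reusing each layer's error signal. For a one-hidden-layer network with a_1 = W_1 x, h = \sigma(a_1), a_2 = W_2 h, output \hat{y} = \sigma(a_2) and loss L:

    \delta_2 = \frac{\partial L}{\partial \hat{y}} \odot \sigma'(a_2), \qquad
    \frac{\partial L}{\partial W_2} = \delta_2 \, h^{\top}

    \delta_1 = \left(W_2^{\top} \delta_2\right) \odot \sigma'(a_1), \qquad
    \frac{\partial L}{\partial W_1} = \delta_1 \, x^{\top}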

  18. Gradient Descent (1847): Cauchy discovered Gradient Descent, motivated by the need to compute the orbits of heavenly bodies.
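
For reference (standard notation, not quoted from the slide), the gradient descent update for an objective J(θ) with learning rate η is

    \theta_{t+1} = \theta_t - \eta \, \nabla_{\theta} J(\theta_t)

Backpropagation computes the gradient; gradient descent (or one of its variants) then uses it to update the weights.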

  19. Universal Approximation Theorem (1989): A multilayered network of neurons with a single hidden layer can be used to approximate any continuous function to any desired precision [7].
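
Stated informally (a standard Cybenko/Hornik-style formulation, not quoted from the slide): for any continuous function f on a compact set K \subset \mathbb{R}^d, a suitable (e.g., sigmoidal) activation \sigma, and any \varepsilon > 0, there exist a width N, weights w_i, biases b_i and coefficients \alpha_i such that

    \sup_{x \in K} \left| f(x) - \sum_{i=1}^{N} \alpha_i \, \sigma\!\left(w_i^{\top} x + b_i\right) \right| < \varepsilon

The theorem guarantees that such a network exists, but says nothing about how large N must be or how to find the weights.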

  20. Chapter 3: The Deep Revival (Module 3)

  21. Unsupervised Pre-Training (2006): Hinton and Salakhutdinov described an effective way of initializing the weights that allows deep autoencoder networks to learn a low-dimensional representation of the data [8].
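
Hinton and Salakhutdinov pre-train with stacked restricted Boltzmann machines; the sketch below (mine, with illustrative names) substitutes plain one-hidden-layer autoencoders to convey the greedy layer-by-layer idea: train each layer to reconstruct its input, use its codes as the input to the next layer, and use the resulting weights to initialize a deep network before fine-tuning.

    import numpy as np

    def sigmoid(z):
        return 1.0 / (1.0 + np.exp(-z))

    def train_shallow_autoencoder(X, hidden_dim, lr=0.1, epochs=200, seed=0):
        # One sigmoid encoder + linear decoder, trained to reconstruct X with
        # plain batch gradient descent on the squared reconstruction error.
        rng = np.random.default_rng(seed)
        n, d = X.shape
        W1, b1 = rng.normal(0, 0.1, (d, hidden_dim)), np.zeros(hidden_dim)
        W2, b2 = rng.normal(0, 0.1, (hidden_dim, d)), np.zeros(d)
        for _ in range(epochs):
            H = sigmoid(X @ W1 + b1)               # encode
            err = (H @ W2 + b2) - X                # reconstruction error
            dW2 = H.T @ err / n
            dH = (err @ W2.T) * H * (1 - H)        # backprop through the encoder
            dW1 = X.T @ dH / n
            W1 -= lr * dW1; b1 -= lr * dH.mean(0)
            W2 -= lr * dW2; b2 -= lr * err.mean(0)
        return W1, b1

    def greedy_pretrain(X, layer_dims):
        # Pre-train one layer at a time; each layer's codes feed the next layer.
        weights, H = [], X
        for h in layer_dims:
            W, b = train_shallow_autoencoder(H, h)
            weights.append((W, b))
            H = sigmoid(H @ W + b)
        return weights   # used to initialize a deep network before fine-tuning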

  22. Unsupervised Pre-Training (1991-1993): The idea of unsupervised pre-training actually dates back to 1991-1993 (J. Schmidhuber), when it was used to train a “Very Deep Learner”.

  23. More Insights (2007-2009): Further investigations into the effectiveness of unsupervised pre-training.

  24. Success in Handwriting Recognition (2009): Graves et al. outperformed all entries in an international Arabic handwriting recognition competition [9].

  25. Success in Speech Recognition (2010): Dahl et al. showed relative error reductions of 16.0% and 23.2% over a state-of-the-art system [10].

  26. New Record on MNIST (2010): Ciresan et al. set a new record on the MNIST dataset using good old backpropagation on GPUs (GPUs enter the scene) [11].

  27. First Superhuman Visual Pattern Recognition (2011): D. C. Ciresan et al. achieved a 0.56% error rate in the IJCNN Traffic Sign Recognition Competition [12].

  28-32. Winning More Visual Recognition Challenges, Success on ImageNet (2012-2016):

         Network          Error    Layers
         AlexNet [13]     16.0%    8
         ZFNet [14]       11.2%    8
         VGGNet [15]      7.3%     19
         GoogLeNet [16]   6.7%     22
         MS ResNet [17]   3.6%     152

  33. Chapter 4: From Cats to Convolutional Neural Networks (Module 4)

  34. Hubel and Wiesel Experiment (1959): Experimentally showed that each neuron has a fixed receptive field, i.e., a neuron will fire only in response to a visual stimulus in a specific region of the visual space [18].

  35. Neocognitron (1980): Used for handwritten character recognition and pattern recognition (Fukushima et al.) [19].

  36. Convolutional Neural Network (1989): Handwritten digit recognition using backpropagation over a Convolutional Neural Network (LeCun et al.) [20].
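
As background (not from the slides; names are illustrative), the core operation of a CNN is a small filter slid across the input. A minimal NumPy sketch of a stride-1, no-padding 2D convolution (strictly, cross-correlation, as in most deep learning libraries):

    import numpy as np

    def conv2d_valid(image, kernel):
        # Slide the kernel over the image (stride 1, no padding) and take the
        # elementwise product-and-sum at every position.
        H, W = image.shape
        kh, kw = kernel.shape
        out = np.zeros((H - kh + 1, W - kw + 1))
        for i in range(out.shape[0]):
            for j in range(out.shape[1]):
                out[i, j] = np.sum(image[i:i + kh, j:j + kw] * kernel)
        return out

    # Example: a vertical-edge filter responds only at the dark-to-bright boundary.
    image = np.zeros((6, 6)); image[:, 3:] = 1.0
    edge_filter = np.array([[-1.0, 1.0], [-1.0, 1.0]])
    print(conv2d_valid(image, edge_filter))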

  37. LeNet-5 (1998): Introduced the (now famous) MNIST dataset (LeCun et al.) [21].

  38. An algorithm inspired by an experiment on cats is today used to detect cats in videos :-)

  39. Chapter 5: Faster, higher, stronger (Module 5)

  40-41. Better Optimization Methods: faster convergence, better accuracies. Timeline: 1983 Nesterov, 2011 Adagrad.
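
For context (sketches of the standard update rules, not taken from the slides; names are illustrative): Nesterov's method evaluates the gradient at a look-ahead point before taking a momentum step, while Adagrad gives each parameter its own learning rate that shrinks as its squared gradients accumulate.

    import numpy as np

    def nesterov_step(theta, grad_fn, velocity, lr=0.01, momentum=0.9):
        # Nesterov accelerated gradient: take the gradient at the look-ahead
        # point theta + momentum * velocity, then apply the momentum update.
        lookahead_grad = grad_fn(theta + momentum * velocity)
        velocity = momentum * velocity - lr * lookahead_grad
        return theta + velocity, velocity

    def adagrad_step(theta, grad, accum, lr=0.01, eps=1e-8):
        # Adagrad: per-parameter step sizes, scaled down by the square root of
        # the accumulated squared gradients.
        accum = accum + grad ** 2
        return theta - lr * grad / (np.sqrt(accum) + eps), accum

    # Example: minimize f(x) = (x - 3)^2 with Adagrad.
    theta, accum = np.array([0.0]), np.zeros(1)
    for _ in range(500):
        theta, accum = adagrad_step(theta, 2 * (theta - 3), accum, lr=0.5)
    print(theta)   # approaches [3.]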
