ECE 6504: Deep Learning for Perception
Topics: (Finish) Backprop, Convolutional Neural Nets
Dhruv Batra, Virginia Tech


SLIDE 1

ECE 6504: Deep Learning for Perception

Dhruv Batra Virginia Tech

Topics:

– (Finish) Backprop
– Convolutional Neural Nets

SLIDE 2

Administrativia

  • Presentation Assignments

– https://docs.google.com/spreadsheets/d/1m76E4mC0wfRjc4HRBWFdAlXKPIzlEwfw1-u7rBw9TJ8/edit#gid=2045905312

(C) Dhruv Batra 2

SLIDE 3

Recap of last time

(C) Dhruv Batra 3

SLIDE 4

Last Time

  • Notation + Setup
  • Neural Networks
  • Chain Rule + Backprop

(C) Dhruv Batra 4

SLIDE 5

Recall: The Neuron Metaphor

  • Neurons
    – accept information from multiple inputs,
    – transmit information to other neurons.
  • Artificial neuron
    – multiply inputs by weights along edges,
    – apply some function to the set of inputs at each node.

5

Image Credit: Andrej Karpathy, CS231n

SLIDE 6

Activation Functions

  • sigmoid vs tanh

(C) Dhruv Batra 6
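As a quick reference, the two activations compared on this slide can be sketched in NumPy (a minimal sketch; the function names are illustrative, not from the slides):

```python
import numpy as np

def sigmoid(x):
    # squashes inputs to (0, 1); outputs are not zero-centered
    return 1.0 / (1.0 + np.exp(-x))

def tanh(x):
    # squashes inputs to (-1, 1); zero-centered, which often helps optimization
    return np.tanh(x)

# tanh is just a scaled and shifted sigmoid: tanh(x) = 2*sigmoid(2x) - 1
x = np.array([-2.0, 0.0, 2.0])
s, t = sigmoid(x), tanh(x)
```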

SLIDE 7

A quick note

(C) Dhruv Batra 7 Image Credit: LeCun et al. ‘98

SLIDE 8

Rectified Linear Units (ReLU)

(C) Dhruv Batra 8
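A minimal sketch of the ReLU and its (sub)gradient, which is what backprop uses (names illustrative):

```python
import numpy as np

def relu(x):
    # ReLU: max(0, x); cheap to compute and does not saturate for x > 0
    return np.maximum(0.0, x)

def relu_grad(x):
    # subgradient: 1 where x > 0, else 0 (ReLU is a.e. differentiable)
    return (x > 0).astype(float)
```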

SLIDE 9

(C) Dhruv Batra 9

SLIDE 10

(C) Dhruv Batra 10

SLIDE 11

Visualizing Loss Functions

  • Sum of individual losses

(C) Dhruv Batra 11

Image Credit: Andrej Karpathy, CS231n

SLIDE 12

Detour

(C) Dhruv Batra 12

SLIDE 13

Logistic Regression as a Cascade

(C) Dhruv Batra 13


Slide Credit: Marc'Aurelio Ranzato, Yann LeCun

SLIDE 14

Key Computation: Forward-Prop

(C) Dhruv Batra 14

Slide Credit: Marc'Aurelio Ranzato, Yann LeCun

SLIDE 15

Key Computation: Back-Prop

(C) Dhruv Batra 15

Slide Credit: Marc'Aurelio Ranzato, Yann LeCun
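The forward/backward passes for the logistic-regression cascade can be sketched as follows (a minimal sketch assuming binary cross-entropy loss on a single example; variable names are illustrative):

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def fprop(w, x):
    # forward pass: compute and keep intermediate values
    z = np.dot(w, x)           # linear score
    p = sigmoid(z)             # predicted probability
    return z, p

def bprop(w, x, y):
    # backward pass: chain rule through loss -> sigmoid -> dot product
    z, p = fprop(w, x)
    loss = -(y * np.log(p) + (1 - y) * np.log(1 - p))
    dz = p - y                 # dLoss/dz for sigmoid + cross-entropy
    dw = dz * x                # dLoss/dw
    return loss, dw
```

A finite-difference check is a handy way to convince yourself the analytic gradient is right.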

SLIDE 16

Plan for Today

  • MLPs
    – Notation
    – Backprop
  • CNNs
    – Notation
    – Convolutions
    – Forward pass
    – Backward pass

(C) Dhruv Batra 16

SLIDE 17

Multilayer Networks

  • Cascade neurons together
  • The output from one layer is the input to the next
  • Each layer has its own set of weights

(C) Dhruv Batra 17

Image Credit: Andrej Karpathy, CS231n
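The cascade above can be sketched as a forward pass through a stack of fully connected layers (a minimal sketch of an illustrative 3 -> 4 -> 2 network; shapes and names are not from the slides):

```python
import numpy as np

def mlp_forward(x, weights, biases):
    # each layer's output is the next layer's input
    h = x
    for W, b in zip(weights[:-1], biases[:-1]):
        h = np.tanh(W @ h + b)       # hidden layers: affine map + nonlinearity
    W, b = weights[-1], biases[-1]
    return W @ h + b                 # output layer: raw scores

rng = np.random.default_rng(0)
weights = [rng.standard_normal((4, 3)), rng.standard_normal((2, 4))]
biases = [np.zeros(4), np.zeros(2)]
scores = mlp_forward(rng.standard_normal(3), weights, biases)
```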

SLIDE 18

Equivalent Representations

(C) Dhruv Batra 18

Slide Credit: Marc'Aurelio Ranzato, Yann LeCun

SLIDE 19

Backward Propagation

Question: Does BPROP work with ReLU layers only?
Answer: Nope, any a.e. (almost-everywhere) differentiable transformation works.

Question: What's the computational cost of BPROP?
Answer: About twice FPROP (need to compute gradients w.r.t. input and parameters at every layer).

Note: FPROP and BPROP are duals of each other. E.g., a SUM in FPROP becomes a COPY in BPROP (and vice versa).

Slide Credit: Marc'Aurelio Ranzato, Yann LeCun

(C) Dhruv Batra 19

SLIDE 20

Fully Connected Layer

Example: 200x200 image, 40K hidden units: ~2B parameters!!!

  • Spatial correlation is local
  • Waste of resources; besides, we do not have enough training samples anyway

Slide Credit: Marc'Aurelio Ranzato

SLIDE 21

Locally Connected Layer

Example: 200x200 image, 40K hidden units, filter size 10x10: 4M parameters

Note: This parameterization is good when the input image is registered (e.g., face recognition).

Slide Credit: Marc'Aurelio Ranzato

SLIDE 22

Locally Connected Layer

STATIONARITY? Statistics are similar at different locations.

Example: 200x200 image, 40K hidden units, filter size 10x10: 4M parameters

Note: This parameterization is good when the input image is registered (e.g., face recognition).

Slide Credit: Marc'Aurelio Ranzato

SLIDE 23

Convolutional Layer

Share the same parameters across different locations (assuming input is stationary): convolutions with learned kernels

Slide Credit: Marc'Aurelio Ranzato
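Weight sharing can be made concrete with a naive "valid" 2D convolution (a minimal sketch; like most deep-learning libraries, this is really cross-correlation, i.e., the kernel is not flipped):

```python
import numpy as np

def conv2d(image, kernel):
    # slide the same KxK kernel over every location: weight sharing
    D, K = image.shape[0], kernel.shape[0]
    out = np.zeros((D - K + 1, D - K + 1))
    for r in range(out.shape[0]):
        for c in range(out.shape[1]):
            patch = image[r:r + K, c:c + K]
            out[r, c] = np.sum(patch * kernel)   # dot product with the patch
    return out
```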

SLIDE 24

(C) Dhruv Batra 24

  • "Convolution of box signal with itself2" by Brian Amberg (derivative work: Tinos). Licensed under CC BY-SA 3.0 via Wikimedia Commons: https://commons.wikimedia.org/wiki/File:Convolution_of_box_signal_with_itself2.gif

SLIDE 25

Convolution Explained

  • http://setosa.io/ev/image-kernels/
  • https://github.com/bruckner/deepViz

(C) Dhruv Batra 25

SLIDE 26

Convolutional Layer

Slide Credit: Marc'Aurelio Ranzato

(C) Dhruv Batra 26

SLIDE 41

Mathieu et al. “Fast training of CNNs through FFTs” ICLR 2014

Convolutional Layer

Slide Credit: Marc'Aurelio Ranzato

(C) Dhruv Batra 41

SLIDE 42

Convolutional Layer

(input image) * kernel = (output feature map), e.g., with the 3x3 kernel:

  1 0 1
  1 0 1
  1 0 1

Slide Credit: Marc'Aurelio Ranzato

(C) Dhruv Batra 42

SLIDE 43

Convolutional Layer

Learn multiple filters.

E.g.: 200x200 image, 100 filters, filter size 10x10: 10K parameters

Slide Credit: Marc'Aurelio Ranzato

(C) Dhruv Batra 43
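The parameter counts quoted on the last few slides can be checked directly (a sketch; biases are ignored, as on the slides):

```python
def n_params_fully_connected(d, hidden):
    # every hidden unit sees every pixel of a d x d image
    return d * d * hidden

def n_params_locally_connected(hidden, k):
    # every hidden unit has its own private k x k filter
    return hidden * k * k

def n_params_convolutional(n_filters, k):
    # one k x k filter shared across all locations, per output map
    return n_filters * k * k

print(n_params_fully_connected(200, 40_000))   # ~2B
print(n_params_locally_connected(40_000, 10))  # 4M
print(n_params_convolutional(100, 10))         # 10K
```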

SLIDE 44

Convolutional Nets

(C) Dhruv Batra 44

INPUT 32x32
-> Convolutions -> C1: feature maps 6@28x28
-> Subsampling -> S2: f. maps 6@14x14
-> Convolutions -> C3: f. maps 16@10x10
-> Subsampling -> S4: f. maps 16@5x5
-> Full connection -> C5: layer 120
-> Full connection -> F6: layer 84
-> Gaussian connections -> OUTPUT 10

Image Credit: Yann LeCun, Kevin Murphy

SLIDE 45

Convolutional Layer

Conv. layer: input feature maps h^{n-1}_1, h^{n-1}_2, h^{n-1}_3 -> output feature maps h^n_1, h^n_2

Each output feature map is a rectified sum of the input feature maps convolved with the corresponding kernels:

    h^n_i = max( 0, sum_{j=1}^{#input channels} h^{n-1}_j * w^n_{ij} )

Slide Credit: Marc'Aurelio Ranzato

(C) Dhruv Batra 45
SLIDE 48

Question: What is the size of the output? What's the computational cost?

Answer: It is proportional to the number of filters and depends on the stride. If kernels have size KxK, the input has size DxD, the stride is 1, and there are M input feature maps and N output feature maps, then:

  • the input has size M@DxD
  • the output has size N@(D-K+1)x(D-K+1)
  • the kernels have MxNxKxK coefficients (which have to be learned)
  • cost: M*K*K*N*(D-K+1)*(D-K+1)

Question: How many feature maps? What's the size of the filters?

Answer: Usually, there are more output feature maps than input feature maps. Convolutional layers can increase the number of hidden units by big factors (and are expensive to compute). The size of the filters has to match the size/scale of the patterns we want to detect (task dependent).

Convolutional Layer

Slide Credit: Marc'Aurelio Ranzato

(C) Dhruv Batra 48
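The formulas on this slide translate directly into code (a sketch for stride-1 "valid" convolution; the example numbers below are illustrative):

```python
def conv_layer_stats(D, K, M, N):
    # D x D input, K x K kernels, M input maps, N output maps, stride 1
    out = D - K + 1                      # output is N @ out x out
    n_params = M * N * K * K             # kernel coefficients to learn
    cost = M * K * K * N * out * out     # multiply-adds in the forward pass
    return out, n_params, cost

out, n_params, cost = conv_layer_stats(D=200, K=10, M=3, N=100)
```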

SLIDE 49

A standard neural net applied to images:

  • scales quadratically with the size of the input
  • does not leverage stationarity

Solution:

  • connect each hidden unit to a small patch of the input
  • share the weight across space

This is called a convolutional layer. A network with convolutional layers is called a convolutional network.

LeCun et al. “Gradient-based learning applied to document recognition” IEEE 1998

Key Ideas

Slide Credit: Marc'Aurelio Ranzato

(C) Dhruv Batra 49

SLIDE 50

Let us assume the filter is an “eye” detector. Q: how can we make the detection robust to the exact location of the eye?

Pooling Layer

Slide Credit: Marc'Aurelio Ranzato

(C) Dhruv Batra 50

SLIDE 51

By “pooling” (e.g., taking max) filter responses at different locations we gain robustness to the exact spatial location of features.

Pooling Layer

Slide Credit: Marc'Aurelio Ranzato

(C) Dhruv Batra 51

SLIDE 52

Pooling Layer: Examples

Max-pooling:
    h^n_i(r, c) = max_{r̄ in N(r), c̄ in N(c)} h^{n-1}_i(r̄, c̄)

Average-pooling:
    h^n_i(r, c) = mean_{r̄ in N(r), c̄ in N(c)} h^{n-1}_i(r̄, c̄)

L2-pooling:
    h^n_i(r, c) = sqrt( sum_{r̄ in N(r), c̄ in N(c)} h^{n-1}_i(r̄, c̄)^2 )

L2-pooling over features:
    h^n_i(r, c) = sqrt( sum_{j in N(i)} h^{n-1}_j(r, c)^2 )

Slide Credit: Marc'Aurelio Ranzato

(C) Dhruv Batra 52
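Max-pooling, the most common of these, can be sketched for a single feature map (assuming non-overlapping KxK pools, i.e., stride = K):

```python
import numpy as np

def max_pool(fmap, K):
    # non-overlapping KxK max-pooling of a D x D feature map
    D = fmap.shape[0]
    out = np.zeros((D // K, D // K))
    for r in range(out.shape[0]):
        for c in range(out.shape[1]):
            # keep only the strongest response in each pool:
            # robustness to the exact spatial location of the feature
            out[r, c] = fmap[r*K:(r+1)*K, c*K:(c+1)*K].max()
    return out
```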

SLIDE 53

Question: What is the size of the output? What's the computational cost?

Answer: The size of the output depends on the stride between the pools. For instance, if pools do not overlap and have size KxK, and the input has size DxD with M input feature maps, then:

  • the output is M@(D/K)x(D/K)
  • the computational cost is proportional to the size of the input (negligible compared to a convolutional layer)

Question: How should I set the size of the pools?

Answer: It depends on how much invariance or robustness to distortions we want the representation to have. It is best to pool slowly (via a few stacks of conv-pooling layers).

Pooling Layer

Slide Credit: Marc'Aurelio Ranzato

(C) Dhruv Batra 53

SLIDE 54

Task: detect orientation L/R Conv layer: linearizes manifold

Pooling Layer: Interpretation

Slide Credit: Marc'Aurelio Ranzato

(C) Dhruv Batra 54

SLIDE 55

Conv layer: linearizes manifold Pooling layer: collapses manifold Task: detect orientation L/R

Pooling Layer: Interpretation

Slide Credit: Marc'Aurelio Ranzato

(C) Dhruv Batra 55

SLIDE 56

Pooling Layer: Receptive Field Size

Conv. layer: h^{n-1} -> h^n -> Pool. layer: h^{n+1}

If convolutional filters have size KxK and stride 1, and the pooling layer has pools of size PxP, then each unit in the pooling layer depends upon a patch (at the input of the preceding conv. layer) of size: (P+K-1)x(P+K-1)

Slide Credit: Marc'Aurelio Ranzato

(C) Dhruv Batra 56
SLIDE 58

ConvNets: Typical Stage

One stage (zoom): Convol. -> Pooling

Courtesy of K. Kavukcuoglu

Slide Credit: Marc'Aurelio Ranzato

(C) Dhruv Batra 58

SLIDE 59

ConvNets: Typical Stage

One stage (zoom): Convol. -> Pooling

Conceptually similar to: SIFT, HoG, etc.

Slide Credit: Marc'Aurelio Ranzato

(C) Dhruv Batra 59

SLIDE 60

Courtesy of K. Kavukcuoglu

Note: after one stage the number of feature maps is usually increased (conv. layer) and the spatial resolution is usually decreased (stride in conv. and pooling layers). The receptive field gets bigger.

Reasons:

  • gain invariance to spatial translation (pooling layer)
  • increase specificity of features (approaching object-specific units)

Slide Credit: Marc'Aurelio Ranzato

(C) Dhruv Batra 60

SLIDE 61

ConvNets: Typical Architecture

Whole system: Input Image -> 1st stage -> 2nd stage -> 3rd stage -> Fully Conn. Layers -> Class Labels

One stage (zoom): Convol. -> Pooling

Slide Credit: Marc'Aurelio Ranzato

(C) Dhruv Batra 61

SLIDE 62

Visualizing Learned Filters

(C) Dhruv Batra 62 Figure Credit: [Zeiler & Fergus ECCV14]

SLIDE 63

Visualizing Learned Filters

(C) Dhruv Batra 63 Figure Credit: [Zeiler & Fergus ECCV14]

SLIDE 64

Visualizing Learned Filters

(C) Dhruv Batra 64 Figure Credit: [Zeiler & Fergus ECCV14]

SLIDE 65

Frome et al. “DeViSE: A Deep Visual-Semantic Embedding Model” NIPS 2013

Matching: a CNN (image) and a text embedding (e.g., “tiger”) meet in a shared representation

Fancier Architectures: Multi-Modal

Slide Credit: Marc'Aurelio Ranzato

(C) Dhruv Batra 65

SLIDE 66

Zhang et al. “PANDA..” CVPR 2014

image -> [Conv -> Norm -> Pool] x4 -> Fully Conn. x4 -> Attr. 1, Attr. 2, ..., Attr. N

Fancier Architectures: Multi-Task

Slide Credit: Marc'Aurelio Ranzato

(C) Dhruv Batra 66

SLIDE 67

Any DAG of differentiable modules is allowed!

Fancier Architectures: Generic DAG

Slide Credit: Marc'Aurelio Ranzato

(C) Dhruv Batra 67