

slide-1
SLIDE 1

CSC 411: Lecture 11: Neural Networks II

Class based on Raquel Urtasun & Rich Zemel's lectures

Sanja Fidler

University of Toronto

March 2, 2016

Urtasun, Zemel, Fidler (UofT) CSC 411: 11-Neural Networks II March 2, 2016 1 / 55

slide-2
SLIDE 2

Today

Deep learning for Object Recognition


slide-5
SLIDE 5

Neural Nets for Object Recognition

People are very good at recognizing shapes

- Intrinsically difficult; computers are bad at it

Why is it difficult?

slide-6
SLIDE 6

Why is it a Problem?

Difficult scene conditions [From: Grauman & Leibe]


slide-7
SLIDE 7

Why is it a Problem?

Huge within-class variations. Recognition is mainly about modeling variation. [Pic from: S. Lazebnik]


slide-8
SLIDE 8

Why is it a Problem?

Tons of classes [Biederman]



slide-13
SLIDE 13

Neural Nets for Object Recognition

People are very good at recognizing shapes

- Intrinsically difficult; computers are bad at it

Some reasons why it is difficult:

- Segmentation: real scenes are cluttered
- Invariances: we are very good at ignoring all sorts of variations that do not affect shape
- Deformations: natural shape classes allow variations (faces, letters, chairs)
- A huge amount of computation is required


slide-19
SLIDE 19

How to Deal with Large Input Spaces

How can we apply neural nets to images?

Images can have millions of pixels, i.e., x is very high dimensional

How many parameters do I have? Prohibitive to have fully-connected layers

What can we do? We can use a locally connected layer

slide-20
SLIDE 20

Locally Connected Layer

Example: 200x200 image, 40K hidden units, filter size 10x10 → 4M parameters

[Ranzato]

Note: this parameterization is good when the input image is registered (e.g., face recognition).
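As a sanity check on the counts above (assuming each of the 40K hidden units looks at its own 10x10 patch, with no weight sharing), a quick back-of-the-envelope in Python; the convolutional count uses the 100-filter example from a later slide:

```python
# Parameter counts for a 200x200 input image (illustrative numbers from the slides).
image_pixels = 200 * 200          # 40,000 inputs
hidden_units = 40_000             # roughly one unit per spatial location
filter_size = 10 * 10             # each unit sees a 10x10 patch

# Fully connected: every hidden unit connects to every pixel.
fc_params = image_pixels * hidden_units          # 1.6 billion

# Locally connected: every unit has its own private 10x10 filter.
local_params = hidden_units * filter_size        # 4 million

# Convolutional: one shared 10x10 filter per feature map, 100 filters.
conv_params = 100 * filter_size                  # 10,000

print(fc_params, local_params, conv_params)
```

The jump from 1.6B to 4M to 10K parameters is the whole motivation for local connectivity and weight sharing.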


slide-22
SLIDE 22

When Will this Work?

This is good when the input is (roughly) registered

slide-23
SLIDE 23

General Images

The object can be anywhere

[Slide: Y. Zhu]



slide-26
SLIDE 26

Locally Connected Layer

Stationarity? Statistics are similar at different locations

Example: 200x200 image, 40K hidden units, filter size 10x10 → 4M parameters

[Ranzato]

slide-27
SLIDE 27

The Replicated Feature Approach

Adopt an approach apparently used in monkey visual systems: use many different copies of the same feature detector.

- Copies have slightly different positions.
- Could also replicate across scale and orientation (tricky and expensive).
- Replication reduces the number of free parameters to be learned.

Use several different feature types, each with its own replicated pool of detectors.

- Allows each patch of image to be represented in several ways.

The red connections all have the same weight.

slide-28
SLIDE 28

Convolutional Neural Net

Idea: statistics are similar at different locations (LeCun 1998)

Connect each hidden unit to a small input patch and share the weights across space

This is called a convolution layer, and the network is a convolutional network

slide-29
SLIDE 29

Convolutional Layer

[Ranzato]

h_j^n = max(0, sum_{k=1}^K h_k^{n-1} * w_{jk}^n)

where * denotes spatial convolution and the max implements a ReLU.
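A minimal NumPy sketch of this layer: a "valid"-mode correlation of each filter against all input maps, summed over the input channels k, then the ReLU max(0, ·). Shapes and names are illustrative, not from the slides:

```python
import numpy as np

def conv_layer(h_prev, w):
    """h_prev: (K, H, W) input maps; w: (J, K, F, F) filters.
    Returns (J, H-F+1, W-F+1) maps: h_j = max(0, sum_k h_k * w_jk)."""
    J, K, F, _ = w.shape
    _, H, W = h_prev.shape
    out = np.zeros((J, H - F + 1, W - F + 1))
    for j in range(J):
        for y in range(H - F + 1):
            for x in range(W - F + 1):
                patch = h_prev[:, y:y+F, x:x+F]     # all K input channels
                out[j, y, x] = np.sum(patch * w[j]) # correlate and sum over k
    return np.maximum(0.0, out)                     # ReLU

h = np.random.randn(3, 8, 8)      # K=3 input maps
w = np.random.randn(4, 3, 3, 3)   # J=4 filters of size 3x3
print(conv_layer(h, w).shape)     # (4, 6, 6)
```

Strictly this is cross-correlation rather than a flipped convolution, which is what most deep learning libraries compute as well.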


slide-35
SLIDE 35

Convolutional Layer

Learn multiple filters. E.g.: 200x200 image, 100 filters, filter size 10x10 → 10K parameters

[Ranzato]


slide-40
SLIDE 40

Convolutional Layer

Figure: Left: a convolutional net; right: each neuron computes a linear function followed by an activation function

Hyperparameters of a convolutional layer:

- The number of filters (controls the depth of the output volume)
- The stride: how many units apart we apply a filter spatially (controls the spatial size of the output volume)
- The size w × h of the filters

[http://cs231n.github.io/convolutional-networks/]
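These hyperparameters determine the output volume. With input width W, filter size F, stride S, and zero padding P, the standard spatial-size formula is (W − F + 2P)/S + 1; a small helper (the function name is mine) to check a configuration:

```python
def conv_output_size(w_in, f, stride=1, pad=0):
    """Spatial output size of a conv (or pooling) layer: (W - F + 2P)/S + 1."""
    size, rem = divmod(w_in - f + 2 * pad, stride)
    if rem != 0:
        raise ValueError("filter does not tile the input evenly with this stride")
    return size + 1

# 200x200 image, 10x10 filter, stride 1, no padding -> 191x191 output
print(conv_output_size(200, 10))                   # 191
# stride 2 with padding 4: (200 - 10 + 8)/2 + 1
print(conv_output_size(200, 10, stride=2, pad=4))  # 100
```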

slide-41
SLIDE 41

MLP vs ConvNet

Figure: Top: MLP, bottom: Convolutional neural network

[http://cs231n.github.io/convolutional-networks/]


slide-42
SLIDE 42

Pooling Layer

By “pooling” (e.g., taking the max of) filter responses at different locations, we gain robustness to the exact spatial location of features.

[Ranzato]

slide-43
SLIDE 43

Pooling Options

- Max pooling: return the maximum of the arguments
- Average pooling: return the average of the arguments
- Other types of pooling exist
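A quick NumPy sketch of both options on non-overlapping P×P pools (assuming the input sides divide evenly by P; the helper name is mine):

```python
import numpy as np

def pool2d(x, p, mode="max"):
    """Non-overlapping PxP pooling over a 2D map whose sides divide by P."""
    h, w = x.shape
    blocks = x.reshape(h // p, p, w // p, p)   # group into PxP tiles
    if mode == "max":
        return blocks.max(axis=(1, 3))
    return blocks.mean(axis=(1, 3))            # average pooling

x = np.array([[1., 2., 5., 6.],
              [3., 4., 7., 8.],
              [0., 0., 1., 1.],
              [0., 4., 1., 1.]])
print(pool2d(x, 2, "max"))    # [[4. 8.] [4. 1.]]
print(pool2d(x, 2, "mean"))   # [[2.5 6.5] [1. 1.]]
```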


slide-47
SLIDE 47

Pooling

Figure: Left: pooling; right: max pooling example

Hyperparameters of a pooling layer:

- The spatial extent F
- The stride

[http://cs231n.github.io/convolutional-networks/]

slide-48
SLIDE 48

Pooling Layer: Receptive Field Size

[Ranzato]

If convolutional filters have size K×K and stride 1, and the pooling layer has pools of size P×P, then each unit in the pooling layer depends on a patch (at the input of the preceding conv layer) of size (P+K−1)×(P+K−1).
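The (P+K−1) claim can be verified empirically: perturb each input pixel and see whether the first pooled unit changes. This sketch uses hypothetical sizes K=3 and P=2, an all-ones conv filter with stride 1, and max pooling:

```python
import numpy as np

K, P, N = 3, 2, 8                     # filter size, pool size, input width

def pooled_unit(x):
    """Value of pooling unit (0,0): max over a PxP block of KxK window sums."""
    conv = np.array([[x[i:i+K, j:j+K].sum()        # stride-1 conv, all-ones filter
                      for j in range(N - K + 1)]
                     for i in range(N - K + 1)])
    return conv[:P, :P].max()                      # first PxP pool

base = np.zeros((N, N))
affected = set()
for i in range(N):
    for j in range(N):
        bump = base.copy()
        bump[i, j] = 1.0                           # perturb one input pixel
        if pooled_unit(bump) != pooled_unit(base):
            affected.add((i, j))

side = max(i for i, _ in affected) + 1
print(side, P + K - 1)                             # both are 4
```

The affected pixels form exactly a 4×4 = (P+K−1)×(P+K−1) patch in the top-left corner of the input.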


slide-52
SLIDE 52

Backpropagation with Weight Constraints

It is easy to modify the backpropagation algorithm to incorporate linear constraints between the weights.

To constrain w1 = w2, we need Δw1 = Δw2.

We compute the gradients as usual, and then modify them so that they satisfy the constraints:

compute ∂E/∂w1 and ∂E/∂w2

use ∂E/∂w1 + ∂E/∂w2 for both w1 and w2

So if the weights started off satisfying the constraints, they will continue to satisfy them.

This is the intuition behind backprop with shared weights. In practice, write down the equations and compute the derivatives (it's a nice exercise; do it at home).
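A tiny numeric illustration (my own example, not from the slides): tie two weights in y = w1·x1 + w2·x2 with squared error, update both with the summed gradient, and the constraint w1 = w2 is preserved while the error goes to zero:

```python
# E = 0.5 * (w1*x1 + w2*x2 - t)^2, with the constraint w1 == w2.
x1, x2, t = 1.0, 2.0, 3.0
w1 = w2 = 0.5                      # start off satisfying the constraint
lr = 0.1

for _ in range(20):
    err = w1 * x1 + w2 * x2 - t
    g1, g2 = err * x1, err * x2    # ordinary gradients dE/dw1, dE/dw2
    g = g1 + g2                    # shared gradient for the tied weight
    w1 -= lr * g
    w2 -= lr * g                   # identical update keeps w1 == w2

print(w1 == w2, abs(w1 * x1 + w2 * x2 - t) < 1e-6)   # True True
```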

slide-53
SLIDE 53

Now let’s make this very deep to get a real state-of-the-art object recognition system


slide-54
SLIDE 54

Convolutional Neural Networks (CNN)

Remember from your image processing / computer vision course about filtering?


slide-55
SLIDE 55

Convolutional Neural Networks (CNN)

If our filter is [−1, 1], you get a vertical edge detector


slide-56
SLIDE 56

Convolutional Neural Networks (CNN)

Now imagine we want to have many filters (e.g., vertical, horizontal, corners, one for dots). We will use a filterbank.

slide-57
SLIDE 57

Convolutional Neural Networks (CNN)

So applying a filterbank to an image yields a cube-like output, a 3D matrix in which each slice is an output of convolution with one filter. We apply an activation function on each hidden unit (typically a ReLU).




slide-60
SLIDE 60

Convolutional Neural Networks (CNN)

Do some additional tricks. A popular one is called max pooling. Any idea why you would do this? To get invariance to small shifts in position.

slide-61
SLIDE 61

Convolutional Neural Networks (CNN)

Now add another “layer” of filters. For each filter again do convolution, but this time with the output cube of the previous layer.


slide-62
SLIDE 62

Convolutional Neural Networks (CNN)

Keep adding a few layers. Any idea what’s the purpose of more layers? Why can’t we just have a full bunch of filters in one layer?


slide-63
SLIDE 63

Convolutional Neural Networks (CNN)

In the end, add one or two fully (or densely) connected layers. In these layers we don't do convolution; we just take a dot product between the “filter” and the output of the previous layer.

slide-64
SLIDE 64

Convolutional Neural Networks (CNN)

Add one final layer: a classification layer. Each dimension of its output vector tells us the probability of the input image being of a certain class.
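Such a layer is typically a linear map followed by a softmax, which turns raw scores into probabilities that sum to one. A minimal sketch (the class names and scores are hypothetical):

```python
import numpy as np

def softmax(z):
    z = z - z.max()           # subtract max for numerical stability; same result
    e = np.exp(z)
    return e / e.sum()

classes = ["dog", "cat", "boat"]
scores = np.array([2.0, 1.0, 0.1])     # output of the last linear layer
probs = softmax(scores)
print(classes[int(np.argmax(probs))])  # dog
print(round(float(probs.sum()), 6))    # 1.0
```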

slide-65
SLIDE 65

Convolutional Neural Networks (CNN)

The trick is to not hand-fix the weights, but to train them. Train them such that when the network sees a picture of a dog, the last layer will say “dog”.


slide-66
SLIDE 66

Convolutional Neural Networks (CNN)

Or when the network sees a picture of a cat, the last layer will say “cat”.


slide-67
SLIDE 67

Convolutional Neural Networks (CNN)

Or when the network sees a picture of a boat, the last layer will say “boat”... The more pictures the network sees, the better.


slide-68
SLIDE 68

Classification

Once trained, we feed in an image or a crop, run it through the network, and read out the class with the highest probability in the last (classification) layer.

slide-69
SLIDE 69

Example

[http://cs231n.github.io/convolutional-networks/]


slide-70
SLIDE 70

Architecture for Classification

input → CONV → LOCAL CONTRAST NORM → MAX POOLING → CONV → LOCAL CONTRAST NORM → MAX POOLING → CONV → CONV → CONV → MAX POOLING → FULLY CONNECTED → FULLY CONNECTED → LINEAR → category prediction

Krizhevsky et al., “ImageNet Classification with Deep CNNs”, NIPS 2012

[Ranzato]

slide-71
SLIDE 71

Architecture for Classification

Total number of parameters: 60M
(per layer: 4M, 16M, 37M, 442K, 1.3M, 884K, 307K, 35K)

Total number of flops: 832M
(per layer: 4M, 16M, 37M, 74M, 224M, 149M, 223M, 105M)

Krizhevsky et al., “ImageNet Classification with Deep CNNs”, NIPS 2012

[Ranzato]
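The per-layer numbers on this slide are consistent with the stated totals; a quick arithmetic check (K and M as 1e3 and 1e6):

```python
K, M = 1e3, 1e6
params = [4*M, 16*M, 37*M, 442*K, 1.3*M, 884*K, 307*K, 35*K]
flops  = [4*M, 16*M, 37*M, 74*M, 224*M, 149*M, 223*M, 105*M]
print(round(sum(params) / M))   # 60 (million)
print(round(sum(flops) / M))    # 832 (million)
```

Note how the parameters are dominated by the fully connected layers while the flops are dominated by the convolutions.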

slide-72
SLIDE 72

ImageNet

ImageNet, the biggest dataset for object classification: http://image-net.org/

1000 classes, 1.2M training images, 150K test images

slide-73
SLIDE 73

The 2012 Computer Vision Crisis

(Classification) (Detection)


slide-74
SLIDE 74

So Neural Networks are Great

So networks turn out to be great. Everything is deep, even if it's shallow!

Companies are leading the competitions, as they have more computational power.

At this point Google, Facebook, Microsoft, Baidu “steal” most neural network professors/students from academia.

slide-75
SLIDE 75

So Neural Networks are Great

But to train the networks you need quite a bit of computational power (e.g., GPU farm). So what do you do?


slide-76
SLIDE 76

So Neural Networks are Great

Buy even more.


slide-77
SLIDE 77

So Neural Networks are Great

And train more layers: 16 instead of 7 before; 144 million parameters.

Figure: K. Simonyan, A. Zisserman, Very Deep Convolutional Networks for Large-Scale Image Recognition. arXiv 2014

slide-78
SLIDE 78

150 Layers!

Networks are now at 150 layers. They use skip connections with a special form. In fact, they don't fit on this screen. Amazing performance! A lot of “mistakes” are due to wrong ground truth.

[He, K., Zhang, X., Ren, S. and Sun, J. Deep Residual Learning for Image Recognition. arXiv:1512.03385, 2015]

slide-79
SLIDE 79

Results: Object Classification

Slide: R. Liao. Paper: [He, K., Zhang, X., Ren, S. and Sun, J. Deep Residual Learning for Image Recognition. arXiv:1512.03385, 2015]

slide-80
SLIDE 80

Results: Object Detection

Slide: R. Liao. Paper: [He, K., Zhang, X., Ren, S. and Sun, J. Deep Residual Learning for Image Recognition. arXiv:1512.03385, 2015]


slide-84
SLIDE 84

What do CNNs Learn?

Figure: Filters in the first convolutional layer of Krizhevsky et al


slide-85
SLIDE 85

What do CNNs Learn?

Figure: Filters in the second layer

[http://arxiv.org/pdf/1311.2901v3.pdf]


slide-86
SLIDE 86

What do CNNs Learn?

Figure: Filters in the third layer

[http://arxiv.org/pdf/1311.2901v3.pdf]


slide-87
SLIDE 87

What do CNNs Learn?

[http://arxiv.org/pdf/1311.2901v3.pdf]



slide-94
SLIDE 94

How to Train Good CNNs

- Normalize your data (standard trick: subtract the mean, divide by the standard deviation)
- Augment your data (add image flips, rotations, etc.)
- Keep training data balanced
- Shuffle data before batching
- In training: random initialization of weights with proper variance
- Monitor your loss function and accuracy (performance) on a validation set
- If your labeled image dataset is small: pre-train your CNN on a large dataset (e.g., ImageNet), and fine-tune on your dataset

[Slide: Y. Zhu; check tutorial slides and code: http://www.cs.utoronto.ca/~fidler/teaching/2015/CSC2523.html]
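The first two tricks are one-liners in NumPy. A sketch with toy images (per-pixel statistics over the training set are also common; the variable names are mine):

```python
import numpy as np

rng = np.random.default_rng(0)
train = rng.uniform(0, 255, size=(100, 32, 32)).astype(np.float32)  # toy "images"

# Normalize: subtract the training-set mean, divide by its standard deviation.
mean, std = train.mean(), train.std()
train_norm = (train - mean) / std

# Augment: add horizontal flips, doubling the training set.
train_aug = np.concatenate([train_norm, train_norm[:, :, ::-1]])
print(train_aug.shape)   # (200, 32, 32)
```

The same mean and std must then be applied to validation and test images.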

slide-95
SLIDE 95

Tricking a Neural Net

Read about it here (and try it!): https://codewords.recurse.com/issues/five/why-do-neural-networks-think-a-panda-is-a-vulture

Watch: https://www.youtube.com/watch?v=M2IebCN9Ht4

slide-96
SLIDE 96

More on NNs

Figure: Generate images: http://arxiv.org/pdf/1511.06434v1.pdf


slide-97
SLIDE 97

More on NNs

Generate text: https://vimeo.com/146492001, https://github.com/karpathy/neuraltalk2, https://github.com/ryankiros/visual-semantic-embedding


slide-98
SLIDE 98

More on NNs

Figure: Compose music: https://www.youtube.com/watch?v=0VTI1BBLydE


slide-99
SLIDE 99

Links

- NNs for computer vision: https://github.com/kjw0612/awesome-deep-vision
- Recurrent neural networks: https://github.com/kjw0612/awesome-rnn
- Lots of code, models, tutorials: https://github.com/carpedm20/awesome-torch
- More links on our class webpage