SLIDE 1

All You Want To Know About CNNs

Yukun Zhu

SLIDE 2-8

Deep Learning

Images from http://imgur.com/ (a run of image-only slides)

SLIDE 9-14

Deep Learning in Vision

Object detection performance (mAP) on PASCAL VOC 2010, built up method by method across these slides:

  • DPM (2010): 33.4
  • segDPM (2014): 40.4
  • RCNN (2014): 53.7
  • RCNN* (Oct 2014): 62.9
  • segRCNN (Jan 2015): 67.2
  • Fast RCNN (Jun 2015): 70.8

SLIDE 15

A Neuron

Image from http://cs231n.github.io/neural-networks-1/

SLIDE 16

A Neuron in Neural Network

Image from http://cs231n.github.io/neural-networks-1/

SLIDE 17

Activation Functions

  • Sigmoid: f(x) = 1 / (1 + e^(-x))
  • ReLU: f(x) = max(0, x)
  • Leaky ReLU: f(x) = max(ax, x), for a small constant a
  • Maxout: f(x) = max(w0·x + b0, w1·x + b1)
  • and many others…
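For quick reference, minimal NumPy sketches of these activations (the Maxout signature here is illustrative; in practice w0, b0, w1, b1 are learned per unit):

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))   # squashes to (0, 1); saturates for large |x|

def relu(x):
    return np.maximum(0.0, x)         # zero for negative inputs

def leaky_relu(x, a=0.01):
    return np.maximum(a * x, x)       # small negative slope avoids "dead" units

def maxout(x, w0, b0, w1, b1):
    # Max over two learned affine functions of the input.
    return np.maximum(w0 @ x + b0, w1 @ x + b1)
```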
SLIDE 18

Neural Network (MLP)

Image modified from http://cs231n.github.io/neural-networks-1/

The network computes a function y = f(x; w)

SLIDE 19-27

Forward Computation

Image and code modified from http://cs231n.github.io/optimization-2/

These slides step through a forward pass of f(x0, x1) = 1 / (1 + exp(-(w0·x0 + w1·x1 + w2))) on a computation graph, one gate at a time. The extracted slide text preserves only the magnitudes of the node values; with the signed values from the cs231n example these slides are modified from (w0 = 2.00, x0 = -1.00, w1 = -3.00, x1 = -2.00, w2 = -3.00), the intermediate results are:

  • w0·x0 = -2.00 and w1·x1 = 6.00
  • sum: 4.00; add w2: 1.00
  • negate: -1.00; exp: 0.37; add 1: 1.37; reciprocal: 0.73 (the sigmoid output)
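A minimal sketch of this forward pass in code, using the same example values:

```python
import math

w = [2.0, -3.0, -3.0]   # w0, w1, w2 (w2 acts as the bias)
x = [-1.0, -2.0]        # x0, x1

# Forward pass, gate by gate; values match the slide annotations.
dot = w[0] * x[0] + w[1] * x[1] + w[2]   # -2.00 + 6.00 - 3.00 = 1.00
f = 1.0 / (1.0 + math.exp(-dot))         # sigmoid(1.00) ≈ 0.73
```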

SLIDE 28

Loss Function

A loss function measures how well the prediction matches the true value. Commonly used loss functions:

  • Squared loss: (y - y′)²
  • Cross-entropy loss: -Σᵢ y′ᵢ · log(yᵢ)
  • and many others
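Minimal NumPy sketches of both losses (here y is the prediction and y′ the target, matching the slide's notation):

```python
import numpy as np

def squared_loss(y, y_true):
    return (y - y_true) ** 2

def cross_entropy_loss(y, y_true):
    # y: predicted probabilities; y_true: one-hot target distribution.
    eps = 1e-12                      # avoid log(0)
    return -np.sum(y_true * np.log(y + eps))
```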
SLIDE 29-30

Loss Function

During training, we would like to minimize the total loss on a set of training data:

  • We want to find w* = argmin_w Σᵢ loss(f(xᵢ; w), yᵢ)
  • Usually we use a gradient-based approach (one-step sketch below):
    ○ w_{t+1} = w_t - α·∇_w loss
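A minimal sketch of that update rule, assuming a `grad_loss` function (e.g., produced by backpropagation) that returns ∇_w of the total loss:

```python
def gradient_step(w, grad_loss, lr=0.01):
    # w_{t+1} = w_t - lr * gradient; `w` is a NumPy array of parameters.
    return w - lr * grad_loss(w)
```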

SLIDE 31-38

Backward Computation

Image and code modified from http://cs231n.github.io/optimization-2/

These slides run backpropagation through the same circuit, gate by gate: each gate multiplies the gradient arriving from above by its local derivative, using

  • f = 1/x → df/dx = -1/x²
  • f = x + 1 → df/dx = 1
  • f = e^x → df/dx = e^x
  • f = -x → df/dx = -1
  • f = x + a → df/dx = 1
  • f = ax → df/dx = a

Starting from df/df = 1.00 at the output, the gradient flows back as 1.00 → -0.53 → -0.53 → -0.20 → 0.20, passes unchanged (0.20) through both add gates, and the multiply gates then give dw0 = -0.20, dx0 = 0.40, dw1 = -0.40, dx1 = -0.60, dw2 = 0.20. (Signs follow the cs231n worked example; the extracted slide text preserves only magnitudes.)
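A minimal sketch of the backward pass, mirroring the cs231n walkthrough (it uses the fact that the chained local derivatives of the sigmoid collapse to (1 - f)·f):

```python
import math

w = [2.0, -3.0, -3.0]
x = [-1.0, -2.0]

# Forward pass (as before), keeping the intermediate for backprop.
dot = w[0] * x[0] + w[1] * x[1] + w[2]
f = 1.0 / (1.0 + math.exp(-dot))              # ≈ 0.73

# Backward pass: gradient on `dot` through the sigmoid.
ddot = (1 - f) * f                            # ≈ 0.20
dw = [x[0] * ddot, x[1] * ddot, 1.0 * ddot]   # ≈ [-0.20, -0.39, 0.20]
dx = [w[0] * ddot, w[1] * ddot]               # ≈ [0.39, -0.59]
```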

SLIDE 39

Why NNs?

SLIDE 40

Universal Approximation Theorem

A feed-forward network with a single hidden layer containing a finite number of neurons can approximate continuous functions on compact subsets of R^n, under mild assumptions on the activation function.

https://en.wikipedia.org/wiki/Universal_approximation_theorem

SLIDE 41

Stone’s Theorem

  • Suppose X is a compact Hausdorff space and B is a subalgebra of C(X, R) such that:
    ○ B separates points.
    ○ B contains the constant function 1.
    ○ If f ∈ B then af ∈ B for all constants a ∈ R.
    ○ If f, g ∈ B, then f + g, max{f, g} ∈ B.
  • Then every function in C(X, R) can be approximated as closely as desired by functions in B.

SLIDE 42

Why CNNs?

SLIDE 43

Problems of MLP in Vision

For input as a 10 × 10 image:

  • A 3-layer MLP with 200 hidden units contains ~100k parameters

For input as a 100 × 100 image:

  • A 1-layer MLP with 20k hidden units contains ~200M parameters
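A quick check of the arithmetic (a sketch; the exact totals depend on the output width and biases, which the slide leaves unspecified):

```python
# 10x10 input, three weight matrices of width 200 (weights only):
small = 10 * 10 * 200 + 200 * 200 + 200 * 200    # = 100,000 ≈ 100k

# 100x100 input, one layer of 20k hidden units (weights only):
large = 100 * 100 * 20_000                        # = 200,000,000 = 200M
print(small, large)
```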

SLIDE 44-47

Can We Do Better?

(Image-only build-up slides.)
SLIDE 48

Can We Do Better?

Based on this observation, the MLP can be improved in two ways:

  • Locally connected instead of fully connected
  • Sharing weights between neurons

We achieve both by using convolutional neurons.

SLIDE 49

Convolutional Layers

Image from http://cs231n.github.io/convolutional-networks/

SLIDE 50

Convolutional Layers

Image from http://cs231n.github.io/convolutional-networks/. See this page for an excellent example of convolution.

(Activation volume dimensions: width × height × depth.)
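To make "locally connected + shared weights" concrete, a minimal single-channel convolution sketch (no padding; the cs231n page linked above has a full multi-channel, multi-filter example):

```python
import numpy as np

def conv2d(x, w, b=0.0, stride=1):
    """Naive 'valid' 2D convolution of image x with kernel w."""
    kh, kw = w.shape
    oh = (x.shape[0] - kh) // stride + 1
    ow = (x.shape[1] - kw) // stride + 1
    out = np.zeros((oh, ow))
    for i in range(oh):
        for j in range(ow):
            patch = x[i * stride:i * stride + kh, j * stride:j * stride + kw]
            out[i, j] = np.sum(patch * w) + b   # same weights at every location
    return out
```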

SLIDE 51

Pooling Layers

Image from http://cs231n.github.io/convolutional-networks/

SLIDE 52

Pooling Layers Example: Max Pooling

Image from http://cs231n.github.io/convolutional-networks/

SLIDE 53

Pooling Layers

Commonly used pooling layers:

  • Max pooling
  • Average pooling

Why pooling layers?

  • Reduce activation dimensionality
  • Robustness against small shifts (see the sketch below)
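A minimal max-pooling sketch (2×2 window with stride 2, the most common configuration):

```python
import numpy as np

def max_pool2d(x, size=2, stride=2):
    oh = (x.shape[0] - size) // stride + 1
    ow = (x.shape[1] - size) // stride + 1
    out = np.empty((oh, ow))
    for i in range(oh):
        for j in range(ow):
            out[i, j] = x[i * stride:i * stride + size,
                          j * stride:j * stride + size].max()
    return out
```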
SLIDE 54

CNN Architecture: An Example

Image from http://cs231n.github.io/convolutional-networks/

SLIDE 55

Layer Activations for CNNs

Image modified from http://cs231n.github.io/convolutional-networks/

(Activations shown for: Conv:1, ReLU:1, Conv:2, ReLU:2, MaxPool:1, Conv:3)

SLIDE 56

Layer Activations for CNNs

Image modified from http://cs231n.github.io/convolutional-networks/

(Activations shown for: MaxPool:2, Conv:5, ReLU:5, Conv:6, ReLU:6, MaxPool:3)

SLIDE 57

Learnt Weights for CNNs: First Conv Layer of AlexNet

Image from http://cs231n.github.io/convolutional-networks/

SLIDE 58

Why CNNs Work Now?

SLIDE 59

Convolutional Neural Networks

  • Faster heterogeneous parallel computing
    ○ CPU clusters, GPUs, etc.
  • Large datasets
    ○ ImageNet: 1.2M images of 1,000 object classes
    ○ COCO: 300k images of 2M object instances
  • Improvements in model architecture
    ○ ReLU, dropout, inception, etc.

SLIDE 60

AlexNet

Krizhevsky, Alex, et al. "ImageNet classification with deep convolutional neural networks." NIPS 2012.

SLIDE 61

GoogLeNet

Szegedy, Christian, et al. "Going deeper with convolutions." arXiv preprint arXiv:1409.4842 (2014).

SLIDE 62

Quiz

# of parameters for the first conv layer of AlexNet?

SLIDE 63

Quiz

# of parameters if the first layer is fully-connected?
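A quick sanity check of both quiz questions, assuming the standard AlexNet conv1 configuration: 96 filters of size 11×11×3, stride 4, over a 227×227×3 input (the size that makes the arithmetic work out), producing a 55×55×96 output:

```python
# Convolutional layer: weights are shared across all spatial positions.
conv1 = 96 * (11 * 11 * 3) + 96           # + 96 biases -> 34,944 parameters

# Fully connected layer producing the same 55x55x96 output:
fc = (227 * 227 * 3) * (55 * 55 * 96)     # ≈ 4.5e10 weights (biases excluded)
print(conv1, fc)
```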

SLIDE 64

Quiz

Given a convolution operation written as f(x; w, b) = Σᵢ,ⱼ xᵢ,ⱼ · wᵢ,ⱼ + b, for a 3×3 patch x and a 3×3 kernel w, can you derive its gradients (∂f/∂x, ∂f/∂w, ∂f/∂b)?
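One way to answer, using the fact that f is linear in each argument:

```latex
\frac{\partial f}{\partial x_{i,j}} = w_{i,j}, \qquad
\frac{\partial f}{\partial w_{i,j}} = x_{i,j}, \qquad
\frac{\partial f}{\partial b} = 1
```

With an upstream gradient g, backpropagation scales each of these by g, which is exactly the multiply- and add-gate rules from the backward-computation slides.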

SLIDE 65

Ready to Build Your Own Networks?

SLIDE 66

Tips and Tricks for CNNs

  • Know your data, clean your data, and normalize your data

○ A common trick: subtract the mean and divide by the standard deviation (see the sketch below).

Image from http://cs231n.github.io/neural-networks-2/
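A minimal per-feature normalization sketch:

```python
import numpy as np

def normalize(X, eps=1e-8):
    """Zero-center each column of X (N x D) and scale it to unit variance."""
    return (X - X.mean(axis=0)) / (X.std(axis=0) + eps)
```

In practice, compute the mean and standard deviation on the training set only and reuse them for validation and test data.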

SLIDE 67

Tips and Tricks for CNNs

  • Augment your data
SLIDE 68

Tips and Tricks for CNNs

  • Organize your data:
    ○ Keep training data balanced
    ○ Shuffle data before batching
  • Feed your data in the correct way
    ○ Image channel order
    ○ Tensor storage order

SLIDE 69

Tips and Tricks for CNNs

(Two images: “First Order, in order” vs. “First Order, out of order”, illustrating what happens when channel or storage order is wrong.)

SLIDE 70

Tips and Tricks for CNNs

Common tensor storage orders:

  • BDRC (batch, depth, row, column; i.e., NCHW)
    ○ Used in Caffe, Torch, Theano; supported by cuDNN
    ○ Pros: faster for convolution (FFT, memory access)
  • BRCD (batch, row, column, depth; i.e., NHWC)
    ○ Used in TensorFlow; limited support in cuDNN
    ○ Pros: fast batch normalization, easier batching

SLIDE 71

Tips and Tricks for CNNs

Designing model architecture

  • Convolution, max pooling, then fully connected layers
  • Nonlinearity
    ○ Stay away from sigmoid (except for the output)
    ○ ReLU is preferred
    ○ Try Leaky ReLU next
    ○ Use Maxout if most ReLU units die (have zero activation)

SLIDE 72

Tips and Tricks for CNNs

Setting parameters

  • Weights
    ○ Random initialization with proper variance (see the sketch below)
  • Biases
    ○ For ReLU we prefer a small positive bias, so that units start out active
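A minimal initialization sketch following this advice. The He scheme (variance 2/fan_in) is my assumption; the slide only asks for "proper variance":

```python
import numpy as np

def init_layer(fan_in, fan_out):
    # He initialization: keeps activation variance stable through ReLU layers.
    W = np.random.randn(fan_in, fan_out) * np.sqrt(2.0 / fan_in)
    b = np.full(fan_out, 0.01)   # small positive bias so ReLU units start active
    return W, b
```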

SLIDE 73

Tips and Tricks for CNNs

Setting hyperparameters

  • Learning rate / momentum (Δw*_t = Δw_t + m·Δw_{t-1})
    ○ Decrease the learning rate while training
    ○ Set momentum to 0.8 - 0.9
  • Batch size
    ○ For large datasets: set to whatever fits your memory
    ○ For smaller datasets: find a tradeoff between instance randomness and gradient smoothness
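A minimal sketch of the momentum update named on this slide (the velocity `v` carries the previous step):

```python
def momentum_step(w, v, grad, lr, m=0.9):
    # New step = gradient step + m * previous step, per the slide's formula.
    v = m * v - lr * grad
    return w + v, v
```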

SLIDE 74

Tips and Tricks for CNNs

Monitoring your training:

  • Split your dataset into training, validation, and test sets
    ○ Tune hyperparameters on validation and evaluate on test
    ○ Keep track of training and validation loss during training
    ○ Stop early if training and validation loss diverge
    ○ Loss doesn't tell you everything. Also track precision, class-wise precision, and more

SLIDE 75

Tips and Tricks for CNNs

Borrow knowledge from another dataset

  • Pre-train your CNN on a large dataset (e.g. ImageNet)
  • Remove / reshape the last few layers
  • Fix the parameters of the first few layers, or use a small learning rate for them
  • Fine-tune the parameters on your own dataset (see the sketch below)
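A minimal fine-tuning sketch of this recipe in PyTorch (which postdates this deck; shown purely as an illustration, and the 10-class output is a placeholder):

```python
import torch.nn as nn
import torchvision.models as models

model = models.alexnet(pretrained=True)      # pre-trained on ImageNet
for param in model.features.parameters():    # fix the early conv layers
    param.requires_grad = False
model.classifier[6] = nn.Linear(4096, 10)    # reshape the last layer for 10 classes
# ...then train only the remaining trainable parameters on your own dataset.
```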
SLIDE 76

Tips and Tricks for CNNs

Debugging

  • import unittest, not import pdb
  • Check your gradient [deprecated]
  • Make your model large enough, and try overfitting the training set
  • Check gradient norms, weight norms, and activation norms
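For reference, a minimal numerical gradient check (centered differences), which is what the "check your gradient" bullet refers to:

```python
import numpy as np

def grad_check(f, x, analytic_grad, h=1e-5, tol=1e-5):
    """Compare the analytic gradient of scalar-valued f at x to finite differences."""
    for i in range(x.size):
        orig = x.flat[i]
        x.flat[i] = orig + h
        f_plus = f(x)
        x.flat[i] = orig - h
        f_minus = f(x)
        x.flat[i] = orig                      # restore
        numeric = (f_plus - f_minus) / (2 * h)
        denom = max(1e-12, abs(numeric) + abs(analytic_grad.flat[i]))
        assert abs(numeric - analytic_grad.flat[i]) / denom < tol, f"mismatch at {i}"
```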
SLIDE 77

Talk is Cheap, Show Me Some Code

SLIDE 78

Image from http://www.linkresearchtools.com/

SLIDE 79

Fully Convolutional Networks

Long, Jonathan, et al. "Fully convolutional networks for semantic segmentation." arXiv preprint arXiv:1411.4038 (2014).