

SLIDE 1

Convolutional Neural Networks (CNNs) and Recurrent Neural Networks (RNNs)

CMSC 678 UMBC

SLIDE 2

Recap from last time…

SLIDE 3

Feed-Forward Neural Network: Multilayer Perceptron

[Figure: feed-forward network with inputs x_1 … x_5, hidden units z_1, z_2, …, and output y]

hidden layer: z_k = G(w_kᵀ x + b_k)
output layer: y = F(uᵀ z + c)

G: (non-linear) activation function
F: (non-linear) activation function; classification: softmax, regression: identity

Information/computation flows forward only: no self-loops (no recurrence/reuse of weights).

SLIDE 4

Flavors of Gradient Descent

"Batch" gradient descent:

    Set t = 0; pick a starting value θ_t
    Until converged:
        set g_t = 0
        for each example i in the full data:
            1. Compute loss l on x_i
            2. Accumulate gradient: g_t += l'(x_i)
        Get scaling factor ρ_t
        Set θ_{t+1} = θ_t - ρ_t * g_t
        Set t += 1

"Minibatch" gradient descent:

    Set t = 0; pick a starting value θ_t
    Until converged:
        get batch B ⊂ full data; set g_t = 0
        for each example i in B:
            1. Compute loss l on x_i
            2. Accumulate gradient: g_t += l'(x_i)
        Get scaling factor ρ_t
        Set θ_{t+1} = θ_t - ρ_t * g_t
        Set t += 1

"Online" gradient descent:

    Set t = 0; pick a starting value θ_t
    Until converged:
        for each example i in the full data:
            1. Compute loss l on x_i
            2. Get gradient: g_t = l'(x_i)
            3. Get scaling factor ρ_t
            4. Set θ_{t+1} = θ_t - ρ_t * g_t
            5. Set t += 1

SLIDE 5

Dropout: Regularization in Neural Networks

𝑦 β„Ž 𝑧 𝑧1 𝑧2 𝜸 𝐱𝟐 π±πŸ‘ π±πŸ’ π±πŸ“

randomly ignore β€œneurons” (hi) during training
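A minimal sketch of the idea (one assumption on my part: this uses the common "inverted" dropout scaling, which the slide does not specify):

    import numpy as np

    def dropout(h, p_drop=0.5, train=True):
        """Randomly ignore hidden units h_i during training.

        Inverted dropout: survivors are scaled by 1/(1 - p_drop) so that
        no extra rescaling is needed at test time."""
        if not train:
            return h                                   # test time: use all units
        mask = np.random.random(h.shape) >= p_drop     # keep w.p. 1 - p_drop
        return h * mask / (1.0 - p_drop)

    h = np.array([0.2, -1.3, 0.7, 0.5])
    print(dropout(h))                  # some entries zeroed, the rest scaled up
    print(dropout(h, train=False))     # unchanged at test time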

SLIDE 6

tanh Activation

tanh_s(x) = 2 / (1 + exp(−2sx)) − 1 = 2σ(2sx) − 1

[Figure: tanh_s curves for s = 0.5, 1, 10]

SLIDE 7

Rectifier Activations

relu(x) = max(0, x)
softplus(x) = log(1 + exp(x))
leaky_relu(x) = 0.01x if x < 0; x if x ≥ 0
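The activations above as straightforward NumPy one-liners (the steepness parameter s mirrors the tanh slide; the sample points are only for a quick sanity check):

    import numpy as np

    def tanh_s(x, s=1.0):
        return 2.0 / (1.0 + np.exp(-2.0 * s * x)) - 1.0   # steepened tanh

    def relu(x):
        return np.maximum(0.0, x)

    def softplus(x):
        return np.log1p(np.exp(x))          # log(1 + exp(x)): a smooth relu

    def leaky_relu(x, alpha=0.01):
        return np.where(x < 0, alpha * x, x)

    x = np.linspace(-3.0, 3.0, 7)
    for f in (tanh_s, relu, softplus, leaky_relu):
        print(f.__name__, np.round(f(x), 3))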

SLIDE 8

Outline

Convolutional Neural Networks

What is a convolution?
Multidimensional Convolutions
Typical Convnet Operations
Deep convnets

Recurrent Neural Networks

Types of recurrence
A basic recurrent cell
BPTT: Backpropagation through time

SLIDE 9

Dot Product

βˆ‘

π‘¦π‘ˆπ‘§ = ෍

𝑙

𝑦𝑙𝑧𝑙

SLIDE 10

Convolution: Modified Dot Product Around a Point

βˆ‘

π‘¦π‘ˆπ‘§ 𝑗 = ෍

𝑙<𝐿

𝑦𝑙+𝑗𝑧𝑙

Convolution/cross-correlation

SLIDE 11

Convolution: Modified Dot Product Around a Point

βˆ‘

π‘¦π‘ˆπ‘§ 𝑗 = ෍

𝑙

𝑦𝑙+𝑗𝑧𝑙 𝑦 ⋆ 𝑧 𝑗 =

Convolution/cross-correlation


SLIDE 15

Convolution: Modified Dot Product Around a Point

βˆ‘

𝑦 ⋆ 𝑧 = π‘¦π‘ˆπ‘§ 𝑗 = ෍

𝑙

𝑦𝑙+𝑗𝑧𝑙

Convolution/cross-correlation

feature map kernel input (β€œimage”)

1-D convolution
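A direct NumPy transcription of the formula, sliding the kernel z over the input y and taking the modified dot product at each offset j (only "valid" offsets here; padding comes later in the deck):

    import numpy as np

    def conv1d(y, z):
        """(y * z)[j] = sum_l y[l + j] * z[l]  (cross-correlation)."""
        L = len(z)
        return np.array([np.dot(y[j:j + L], z) for j in range(len(y) - L + 1)])

    signal = np.array([1., 2., 3., 4., 5.])    # input ("image")
    kernel = np.array([1., 0., -1.])           # kernel
    print(conv1d(signal, kernel))              # feature map: [-2. -2. -2.]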

SLIDE 16

Outline

Convolutional Neural Networks

What is a convolution?
Multidimensional Convolutions
Typical Convnet Operations
Deep convnets

Recurrent Neural Networks

Types of recurrence
A basic recurrent cell
BPTT: Backpropagation through time

SLIDE 17

2-D Convolution

[Figure: a square kernel overlaid on the input ("image")]

width: shape of the kernel (often square)

SLIDE 18

2-D Convolution

input (β€œimage”)

stride(s): how many spaces to move the kernel width: shape of the kernel (often square)

SLIDE 19

2-D Convolution

input (β€œimage”)

stride(s): how many spaces to move the kernel stride=1 width: shape of the kernel (often square)


SLIDE 22

2-D Convolution

input (β€œimage”)

stride(s): how many spaces to move the kernel stride=2 width: shape of the kernel (often square)

SLIDE 23

2-D Convolution

input (β€œimage”)

stride(s): how many spaces to move the kernel stride=2 width: shape of the kernel (often square) skip starting here


SLIDE 26

2-D Convolution

input (β€œimage”)

width: shape of the kernel (often square) stride(s): how many spaces to move the kernel padding: how to handle input/kernel shape mismatches β€œsame”: input.shape == output.shape β€œdifferent”: input.shape β‰  output.shape pad with 0s (one option)

SLIDE 27

2-D Convolution

input (β€œimage”)

width: shape of the kernel (often square) stride(s): how many spaces to move the kernel padding: how to handle input/kernel shape mismatches β€œsame”: input.shape == output.shape β€œdifferent”: input.shape β‰  output.shape pad with 0s (another option) pad with 0s (another option)


SLIDE 29

From fully connected to convolutional networks

[Figure: a fully connected layer applied to an image: every pixel connects to every unit]

Slide credit: Svetlana Lazebnik

SLIDE 30

[Figure: a convolutional layer mapping an image to a feature map via learned weights]

From fully connected to convolutional networks

Convolutional layer

Slide credit: Svetlana Lazebnik


SLIDE 32

Convolution as feature extraction

[Figure: filters/kernels applied to the input, each producing its own feature map]

Slide credit: Svetlana Lazebnik


SLIDE 34

From fully connected to convolutional networks

[Figure: a convolutional layer on an image, with non-linearity and/or pooling, feeding the next layer]

Slide adapted from Svetlana Lazebnik

SLIDE 35

Outline

Convolutional Neural Networks

What is a convolution?
Multidimensional Convolutions
Typical Convnet Operations
Deep convnets

Recurrent Neural Networks

Types of recurrence
A basic recurrent cell
BPTT: Backpropagation through time
Solving the vanishing gradients problem

SLIDE 36

Key operations in a CNN

Input Image → Convolution (learned) → Non-linearity → Spatial pooling → Feature maps

[Figure: each filter applied to the input produces one feature map]

Slide credit: Svetlana Lazebnik, R. Fergus, Y. LeCun

SLIDE 37

Key operations

Input Image → Convolution (learned) → Non-linearity → Spatial pooling → Feature maps

Non-linearity example: Rectified Linear Unit (ReLU)

Slide credit: Svetlana Lazebnik, R. Fergus, Y. LeCun

SLIDE 38

Key operations

Input Image → Convolution (learned) → Non-linearity → Spatial pooling → Feature maps

Spatial pooling example: max pooling

Slide credit: Svetlana Lazebnik, R. Fergus, Y. LeCun
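Since the deck later turns to PyTorch, here is a sketch of how the three key operations chain there (the layer sizes are my own illustrative choices, not from the slides):

    import torch
    import torch.nn as nn

    # Convolution (learned) -> non-linearity -> spatial pooling -> feature maps
    features = nn.Sequential(
        nn.Conv2d(3, 16, kernel_size=3, padding=1),  # 16 learned 3x3 filters
        nn.ReLU(),                                   # non-linearity
        nn.MaxPool2d(kernel_size=2),                 # spatial (max) pooling
    )

    x = torch.randn(1, 3, 32, 32)     # one 32x32 RGB input image
    print(features(x).shape)          # torch.Size([1, 16, 16, 16]) feature maps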

SLIDE 39

Design principles

Reduce filter sizes (except possibly at the lowest layer); factorize filters aggressively.
Use 1×1 convolutions to reduce and expand the number of feature maps judiciously (see the sketch below).
Use skip connections and/or create multiple paths through the network.

Slide credit: Svetlana Lazebnik
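The second principle in code: a hypothetical bottleneck, where a 1×1 convolution mixes channels at each spatial position without looking at neighbors, so it can cheaply shrink or grow the number of feature maps (channel counts are illustrative):

    import torch
    import torch.nn as nn

    reduce = nn.Conv2d(256, 64, kernel_size=1)   # 256 -> 64 feature maps
    expand = nn.Conv2d(64, 256, kernel_size=1)   # 64 -> 256 feature maps

    x = torch.randn(1, 256, 28, 28)
    print(reduce(x).shape)                       # torch.Size([1, 64, 28, 28])
    print(expand(reduce(x)).shape)               # torch.Size([1, 256, 28, 28])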

SLIDE 40

LeNet-5

Y. LeCun, L. Bottou, Y. Bengio, and P. Haffner, Gradient-based learning applied to document recognition, Proc. IEEE 86(11): 2278–2324, 1998.

Slide credit: Svetlana Lazebnik

SLIDE 41

ImageNet

~14 million labeled images, 20k classes. Images gathered from the Internet; human labels via Amazon MTurk. ImageNet Large-Scale Visual Recognition Challenge (ILSVRC): 1.2 million training images, 1000 classes.

www.image-net.org/challenges/LSVRC/

Slide credit: Svetlana Lazebnik

SLIDE 42

http://www.inference.vc/deep-learning-is-easy/

Slide credit: Svetlana Lazebnik

SLIDE 43

Outline

Convolutional Neural Networks

What is a convolution?
Multidimensional Convolutions
Typical Convnet Operations
Deep convnets

Recurrent Neural Networks

Types of recurrence
A basic recurrent cell
BPTT: Backpropagation through time
Solving the vanishing gradients problem

SLIDE 44

AlexNet: ILSVRC 2012 winner

Similar framework to LeNet but:
    max pooling, ReLU nonlinearity
    more data and a bigger model (7 hidden layers, 650K units, 60M params)
    GPU implementation (50x speedup over CPU): two GPUs for a week
    dropout regularization

A. Krizhevsky, I. Sutskever, and G. Hinton, ImageNet Classification with Deep Convolutional Neural Networks, NIPS 2012

Slide credit: Svetlana Lazebnik

SLIDE 45

GoogLeNet

Slide credit: Svetlana Lazebnik

Szegedy et al., 2015


SLIDE 47

GoogLeNet: Auxiliary Classifier at Sub-levels

Idea: try to make each sub-layer good (in its own way) at the prediction task

Slide credit: Svetlana Lazebnik

Szegedy et al., 2015

SLIDE 48

GoogLeNet

An alternative view:

Slide credit: Svetlana Lazebnik

Szegedy et al., 2015

SLIDE 49

ResNet (Residual Network)

He et al. β€œDeep Residual Learning for Image Recognition” (2016)

Make it easy for network layers to represent the identity mapping. Skipping 2+ layers is intentional and needed.

Slide credit: Svetlana Lazebnik
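A stripped-down residual block in PyTorch (omitting the batch normalization that the actual ResNet uses): because the input is added back in, the block only has to learn a residual on top of the identity.

    import torch
    import torch.nn as nn

    class ResidualBlock(nn.Module):
        """y = F(x) + x: the skip connection makes the identity easy to represent."""
        def __init__(self, channels):
            super().__init__()
            self.body = nn.Sequential(               # F(x): the 2 skipped layers
                nn.Conv2d(channels, channels, 3, padding=1),
                nn.ReLU(),
                nn.Conv2d(channels, channels, 3, padding=1),
            )
            self.relu = nn.ReLU()

        def forward(self, x):
            return self.relu(self.body(x) + x)       # residual plus identity path

    x = torch.randn(1, 64, 56, 56)
    print(ResidualBlock(64)(x).shape)                # shape preserved: [1, 64, 56, 56]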

SLIDE 50

Summary: ILSVRC 2012-2015

Team                  | Year | Place | Error (top-5) | External data
SuperVision           | 2012 |       | 16.4%         | no
SuperVision           | 2012 | 1st   | 15.3%         | ImageNet 22k
Clarifai (7 layers)   | 2013 |       | 11.7%         | no
Clarifai              | 2013 | 1st   | 11.2%         | ImageNet 22k
VGG (16 layers)       | 2014 | 2nd   | 7.32%         | no
GoogLeNet (19 layers) | 2014 | 1st   | 6.67%         | no
ResNet (152 layers)   | 2015 | 1st   | 3.57%         |
Human expert*         |      |       | 5.1%          |

http://karpathy.github.io/2014/09/02/what-i-learned-from-competing-against-a-convnet-on-imagenet/

Slide credit: Svetlana Lazebnik

SLIDE 51

Rapid Progress due to CNNs

Classification: ImageNet Challenge top-5 error

Figure source: Kaiming He

Slide credit: Svetlana Lazebnik

SLIDE 52

Outline

Convolutional Neural Networks

What is a convolution?
Multidimensional Convolutions
Typical Convnet Operations
Deep convnets

Recurrent Neural Networks

Types of recurrence
A basic recurrent cell
BPTT: Backpropagation through time

SLIDE 53

Network Types

[Figure: x → h → y]

Feed-forward: linearizable feature input. Bag-of-items classification/regression. The basic non-linear model.

SLIDE 54

Network Types

[Figure: one input x feeding a chain of hidden states h0 → h1 → h2, each emitting an output y0, y1, y2]

Recursive: one input, sequence output. Example: automated caption generation.

SLIDE 55

Network Types

[Figure: inputs x0, x1, x2 feeding hidden states h0 → h1 → h2, with a single output y at the end]

Recursive: sequence input, one output. Examples: document classification; action recognition in video (high-level).

SLIDE 56

Network Types

[Figure: inputs x0, x1, x2 feeding hidden states h0 → h1 → h2, with outputs y0 … y3 emitted after a delay]

Recursive: sequence input, sequence output (time delay). Examples: machine translation; sequential description; summarization.

SLIDE 57

Network Types

[Figure: inputs x0, x1, x2 feeding hidden states h0 → h1 → h2, each step emitting its own output y0, y1, y2]

Recursive: sequence input, sequence output. Examples: part-of-speech tagging; action recognition (fine-grained).

SLIDE 58

RNN Outputs: Image Captions

Show and Tell: A Neural Image Caption Generator, CVPR 15

Slide credit: Arun Mallya

SLIDE 59

RNN Output: Visual Storytelling

[Figure: a sequence of images, each encoded by a CNN, feeds a GRU encoder; a GRU decoder generates the story word by word]

Generated story (Huang et al., 2016): "The family got together for a cookout. They had a lot of delicious food. The dog was happy to be there. They had a great time on the beach. They even had a swim in the water."

Human reference: "The family has gathered around the dinner table to share a meal together. They all pitched in to help cook the seafood to perfection. Afterwards they took the family dog to the beach to get some exercise. The waves were cool and refreshing! The dog had so much fun in the water. One family member decided to get a better view of the waves!"
SLIDE 60

Recurrent Networks

[Figure: unrolled recurrence: inputs x_{i-3} … x_i, hidden states h_{i-3} … h_i, outputs y_{i-3} … y_i]

Observe these inputs one at a time
SLIDE 61

Recurrent Networks

[Figure: unrolled recurrence: inputs x_{i-3} … x_i, hidden states h_{i-3} … h_i, outputs y_{i-3} … y_i]

Observe these inputs one at a time,

predict the corresponding label
SLIDE 62

Recurrent Networks

[Figure: unrolled recurrence: inputs x_{i-3} … x_i, hidden states h_{i-3} … h_i, outputs y_{i-3} … y_i]

Observe these inputs one at a time,

predict the corresponding label from these hidden states
SLIDE 63

Recurrent Networks

[Figure: the same unrolled recurrence, with one input→hidden→output unit boxed as the repeated "cell"]

Observe these inputs one at a time,

predict the corresponding label from these hidden states
SLIDE 64

Outline

Convolutional Neural Networks

What is a convolution?
Multidimensional Convolutions
Typical Convnet Operations
Deep convnets

Recurrent Neural Networks

Types of recurrence
A basic recurrent cell
BPTT: Backpropagation through time

SLIDE 65

A Simple Recurrent Neural Network Cell

[Figure: two timesteps: inputs x_{i-1}, x_i enter through weights U; hidden states h_{i-1} → h_i connect through W; outputs y_{i-1}, y_i leave through S]

SLIDE 66

A Simple Recurrent Neural Network Cell

[Figure: the same two timesteps, with the hidden-state update highlighted as "encoding"]

encoding: h_j = tanh(W h_{j-1} + U x_j)

SLIDE 67

A Simple Recurrent Neural Network Cell

[Figure: the same two timesteps, with "encoding" and "decoding" both highlighted]

encoding: h_j = tanh(W h_{j-1} + U x_j)
decoding: y_j = softmax(S h_j)

SLIDE 68

A Simple Recurrent Neural Network Cell

[Figure: the cell copied across timesteps]

encoding: h_j = tanh(W h_{j-1} + U x_j)
decoding: y_j = softmax(S h_j)

Weights are shared over time. Unrolling/unfolding: copy the RNN cell across time (inputs).

SLIDE 69

Outline

Convolutional Neural Networks

What is a convolution?
Multidimensional Convolutions
Typical Convnet Operations
Deep convnets

Recurrent Neural Networks

Types of recurrence
A basic recurrent cell
BPTT: Backpropagation through time

SLIDE 70

BackPropagation Through Time (BPTT)

β€œUnfold” the network to create a single, large, feed- forward network

  • 1. Weights are copied (W β†’ W(t))
  • 2. Gradients computed (ð𝑋(𝑒)), and
  • 3. Summed (βˆ‘π‘’ ð𝑋(𝑒))
SLIDE 71

BPTT

β„Žπ‘— = tanh(π‘‹β„Žπ‘—βˆ’1 + 𝑉𝑦𝑗) 𝑧𝑗 = softmax(π‘‡β„Žπ‘—)

per-step loss: cross entropy πœ–πΉπ‘— πœ–π‘‹ = πœ–πΉπ‘— πœ–π‘§π‘— πœ–π‘§π‘— πœ–π‘‹ = πœ–πΉπ‘— πœ–π‘§π‘— πœ–π‘§π‘— πœ–β„Žπ‘— πœ–β„Žπ‘— πœ–π‘‹ xi-3 xi-2 xi xi-1 hi-3 hi-2 hi-1 hi yi-3 yi-2 yi yi-1 π‘§π‘—βˆ’3

βˆ—

log π‘ž(π‘§π‘—βˆ’3) π‘§π‘—βˆ’2

βˆ—

log π‘ž(π‘§π‘—βˆ’2) 𝐹𝑗 = 𝑧𝑗

βˆ— log π‘ž(𝑧𝑗)

π‘§π‘—βˆ’1

βˆ—

log π‘ž(π‘§π‘—βˆ’1)

SLIDE 72

BPTT

β„Žπ‘— = tanh(π‘‹β„Žπ‘—βˆ’1 + 𝑉𝑦𝑗) 𝑧𝑗 = softmax(π‘‡β„Žπ‘—)

per-step loss: cross entropy πœ–πΉπ‘— πœ–π‘‹ = πœ–πΉπ‘— πœ–π‘§π‘— πœ–π‘§π‘— πœ–π‘‹ = πœ–πΉπ‘— πœ–π‘§π‘— πœ–π‘§π‘— πœ–β„Žπ‘— πœ–β„Žπ‘— πœ–π‘‹ πœ–β„Žπ‘— πœ–π‘‹ = tanhβ€² π‘‹β„Žπ‘—βˆ’1 + 𝑉𝑦𝑗 πœ–π‘‹β„Žπ‘—βˆ’1 πœ–π‘‹ xi-3 xi-2 xi xi-1 hi-3 hi-2 hi-1 hi yi-3 yi-2 yi yi-1 π‘§π‘—βˆ’3

βˆ—

log π‘ž(π‘§π‘—βˆ’3) π‘§π‘—βˆ’2

βˆ—

log π‘ž(π‘§π‘—βˆ’2) 𝐹𝑗 = 𝑧𝑗

βˆ— log π‘ž(𝑧𝑗)

π‘§π‘—βˆ’1

βˆ—

log π‘ž(π‘§π‘—βˆ’1)

SLIDE 73

BPTT

β„Žπ‘— = tanh(π‘‹β„Žπ‘—βˆ’1 + 𝑉𝑦𝑗) 𝑧𝑗 = softmax(π‘‡β„Žπ‘—)

per-step loss: cross entropy πœ–πΉπ‘— πœ–π‘‹ = πœ–πΉπ‘— πœ–π‘§π‘— πœ–π‘§π‘— πœ–π‘‹ = πœ–πΉπ‘— πœ–π‘§π‘— πœ–π‘§π‘— πœ–β„Žπ‘— πœ–β„Žπ‘— πœ–π‘‹ πœ–β„Žπ‘— πœ–π‘‹ = tanhβ€² π‘‹β„Žπ‘—βˆ’1 + 𝑉𝑦𝑗 πœ–π‘‹β„Žπ‘—βˆ’1 πœ–π‘‹ = tanhβ€² π‘‹β„Žπ‘—βˆ’1 + 𝑉𝑦𝑗 β„Žπ‘—βˆ’1 + 𝑋 πœ–β„Žπ‘—βˆ’1 πœ–π‘‹ xi-3 xi-2 xi xi-1 hi-3 hi-2 hi-1 hi yi-3 yi-2 yi yi-1 π‘§π‘—βˆ’3

βˆ—

log π‘ž(π‘§π‘—βˆ’3) π‘§π‘—βˆ’2

βˆ—

log π‘ž(π‘§π‘—βˆ’2) 𝐹𝑗 = 𝑧𝑗

βˆ— log π‘ž(𝑧𝑗)

π‘§π‘—βˆ’1

βˆ—

log π‘ž(π‘§π‘—βˆ’1)

SLIDE 74

BPTT

[Figure: the unrolled network with per-step losses]

h_j = tanh(W h_{j-1} + U x_j)
y_j = softmax(S h_j)

per-step loss: cross entropy, F_j = y_j* log q(y_j)

∂h_j/∂W = tanh'(W h_{j-1} + U x_j) · [h_{j-1} + W ∂h_{j-1}/∂W], where ∂h_{j-1}/∂W expands in turn through [h_{j-2} + W ∂h_{j-2}/∂W], and so on.

∂F_j/∂W = (∂F_j/∂y_j)(∂y_j/∂h_j)(∂h_j/∂W) = ∑_m ε_m^(j) ∂h_m/∂W,

where ε_m^(j) = (∂F_j/∂y_j)(∂y_j/∂h_j)(∂h_j/∂h_m).

SLIDE 75

BPTT

πœ–β„Žπ‘— πœ–π‘‹ = tanhβ€² π‘‹β„Žπ‘—βˆ’1 + 𝑉𝑦𝑗 β„Žπ‘—βˆ’1 + 𝑋 πœ–β„Žπ‘—βˆ’1 πœ–π‘‹ = tanhβ€² π‘‹β„Žπ‘—βˆ’1 + 𝑉𝑦𝑗 β„Žπ‘—βˆ’1 + tanhβ€² π‘‹β„Žπ‘—βˆ’1 + 𝑉𝑦𝑗 𝑋tanhβ€² π‘‹β„Žπ‘—βˆ’2 + π‘‰π‘¦π‘—βˆ’1 β„Žπ‘—βˆ’2 + 𝑋 πœ–β„Žπ‘—βˆ’2 πœ–π‘‹ = ෍

π‘˜

πœ–πΉπ‘— πœ–π‘§π‘— πœ–π‘§π‘— πœ–β„Žπ‘— πœ–β„Žπ‘— πœ–β„Žπ‘š πœ–β„Žπ‘š πœ–π‘‹(π‘š) = ෍

π‘˜

πœ€

π‘˜ (𝑗) πœ–β„Žπ‘š

πœ–π‘‹(π‘š) πœ€π‘š

(𝑗) = πœ–πΉπ‘—

πœ–π‘§π‘— πœ–π‘§π‘— πœ–β„Žπ‘— πœ–β„Žπ‘— πœ–β„Žπ‘š per-loss, per-step backpropagation error

SLIDE 76

BPTT

[Figure: the unrolled network with per-step losses F_{j-3} … F_j]

h_j = tanh(W h_{j-1} + U x_j)
y_j = softmax(S h_j)

per-step loss: cross entropy, F_j = y_j* log q(y_j)

hidden chain rule, compact form:

∂F_j/∂W = ∑_m (∂F_j/∂y_j)(∂y_j/∂h_j)(∂h_j/∂W^(m))

SLIDE 77

Why Is Training RNNs Hard?

Vanishing gradients: the same weight matrix multiplies in at every timestep, so the gradient contains products of many copies of that matrix.
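A quick numerical illustration of why this is a problem (the 4-dimensional state and the 0.9 scaling are arbitrary choices; the tanh' factors, ignored here, only shrink the product further):

    import numpy as np

    rng = np.random.default_rng(0)
    W = rng.normal(size=(4, 4))
    W *= 0.9 / np.linalg.norm(W, 2)   # force the largest singular value to 0.9

    g = np.ones(4)                    # error signal at the final timestep
    for t in range(1, 21):
        g = W.T @ g                   # one more timestep further back
        if t % 5 == 0:
            print(t, np.linalg.norm(g))   # the norm decays geometrically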

SLIDE 78

The Vanilla RNN Backward

[Figure: three unrolled steps: inputs x_1, x_2, x_3; hidden states h_0 → h_1 → h_2 → h_3; each step emits y_t and a cost C_t]

h_t = tanh(W [x_t; h_{t-1}])
y_t = F(h_t)
C_t = Loss(y_t, GT_t)

Slide credit: Arun Mallya

SLIDE 79

Vanishing Gradient Solution: Motivation

h_t = h_{t-1} + F(x_t)   (identity connection)

⇒ ∂h_t/∂h_{t-1} = 1

The gradient does not decay as the error is propagated all the way back, aka "Constant Error Flow".

(Compare the vanilla cell: h_t = tanh(W [x_t; h_{t-1}]), y_t = F(h_t), C_t = Loss(y_t, GT_t).)

Slide credit: Arun Mallya

SLIDE 80

Vanishing Gradient Solution: Model Implementations

LSTM: Long Short-Term Memory (Hochreiter & Schmidhuber, 1997)
GRU: Gated Recurrent Unit (Cho et al., 2014)

Basic idea: learn to forget

http://colah.github.io/posts/2015-08-Understanding-LSTMs/

[Figure: the cell-state ("representation") line and the forget line in the LSTM diagram]

SLIDE 81

Long Short-Term Memory (LSTM): Hochreiter & Schmidhuber (1997)

Create a "Constant Error Carousel" (CEC), which ensures that gradients don't decay: a memory cell that acts like an accumulator (it contains the identity relationship) over time.

[Figure: LSTM cell: input gate i_t, output gate o_t, and forget gate f_t, each reading (x_t, h_{t-1}) through its own weights W_i, W_o, W_f, gate a memory cell c_t fed through weights W]

c_t = f_t ⊗ c_{t-1} + i_t ⊗ tanh(W [x_t; h_{t-1}] + b)

g_t = σ(W_g [x_t; h_{t-1}] + b_g)   for each gate g ∈ {i, o, f}

Slide credit: Arun Mallya
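One step of the cell in NumPy, following the equations above (the readout h_t = o_t ⊗ tanh(c_t) is the standard formulation, which the slide's diagram implies but does not spell out; all sizes are illustrative):

    import numpy as np

    def sigmoid(a):
        return 1.0 / (1.0 + np.exp(-a))

    def lstm_step(x_t, h_prev, c_prev, W, Wi, Wo, Wf, b, bi, bo, bf):
        v = np.concatenate([x_t, h_prev])                # [x_t; h_{t-1}]
        i_t = sigmoid(Wi @ v + bi)                       # input gate
        o_t = sigmoid(Wo @ v + bo)                       # output gate
        f_t = sigmoid(Wf @ v + bf)                       # forget gate
        c_t = f_t * c_prev + i_t * np.tanh(W @ v + b)    # accumulator cell
        h_t = o_t * np.tanh(c_t)                         # gated readout
        return h_t, c_t

    d_in, d_h = 3, 4
    rng = np.random.default_rng(0)
    Ws = [rng.normal(size=(d_h, d_in + d_h)) * 0.1 for _ in range(4)]
    bs = [np.zeros(d_h) for _ in range(4)]
    h, c = np.zeros(d_h), np.zeros(d_h)
    h, c = lstm_step(rng.normal(size=d_in), h, c, *Ws, *bs)
    print(h.shape, c.shape)                              # (4,) (4,)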

SLIDE 82

I want to use CNNs/RNNs/Deep Learning in my project. I don't want to do this all by hand.
SLIDE 83

Defining A Simple RNN in Python

(Modified Very Slightly)

http://pytorch.org/tutorials/intermediate/char_rnn_classification_tutorial.html


SLIDE 85

Defining A Simple RNN in Python

(Modified Very Slightly)

http://pytorch.org/tutorials/intermediate/char_rnn_classification_tutorial.html

encode

SLIDE 86

Defining A Simple RNN in Python

(Modified Very Slightly)

http://pytorch.org/tutorials/intermediate/char_rnn_classification_tutorial.html

decode
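The code on these slides is screenshots from the linked tutorial; for the record, the tutorial's cell is essentially the following, one linear "encode" step producing the new hidden state and one linear "decode" step producing log-probabilities (reproduced from memory of that tutorial, so treat the details as approximate):

    import torch
    import torch.nn as nn

    class RNN(nn.Module):
        def __init__(self, input_size, hidden_size, output_size):
            super().__init__()
            self.hidden_size = hidden_size
            # encode: new hidden state from [input; previous hidden]
            self.i2h = nn.Linear(input_size + hidden_size, hidden_size)
            # decode: output scores from the same concatenation
            self.i2o = nn.Linear(input_size + hidden_size, output_size)
            self.softmax = nn.LogSoftmax(dim=1)

        def forward(self, input, hidden):
            combined = torch.cat((input, hidden), 1)
            hidden = self.i2h(combined)
            output = self.softmax(self.i2o(combined))
            return output, hidden

        def init_hidden(self):
            return torch.zeros(1, self.hidden_size)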

SLIDE 87

Training A Simple RNN in Python

(Modified Very Slightly)

http://pytorch.org/tutorials/intermediate/char_rnn_classification_tutorial.html

SLIDE 88

Training A Simple RNN in Python

(Modified Very Slightly)

http://pytorch.org/tutorials/intermediate/char_rnn_classification_tutorial.html

Negative log-likelihood

SLIDE 89

Training A Simple RNN in Python

(Modified Very Slightly)

http://pytorch.org/tutorials/intermediate/char_rnn_classification_tutorial.html

Negative log-likelihood; get predictions

SLIDE 90

Training A Simple RNN in Python

(Modified Very Slightly)

http://pytorch.org/tutorials/intermediate/char_rnn_classification_tutorial.html

Negative log-likelihood; get predictions; eval predictions

SLIDE 91

Training A Simple RNN in Python

(Modified Very Slightly)

http://pytorch.org/tutorials/intermediate/char_rnn_classification_tutorial.html

Negative log-likelihood; get predictions; eval predictions; compute gradient

SLIDE 92

Training A Simple RNN in Python

(Modified Very Slightly)

http://pytorch.org/tutorials/intermediate/char_rnn_classification_tutorial.html

Negative log-likelihood; get predictions; eval predictions; compute gradient; perform SGD
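And the training step from the same tutorial, again approximately, with the annotated stages called out in comments (the sizes for its name-classification task are assumptions; RNN is the cell defined above):

    import torch.nn as nn

    n_letters, n_hidden, n_categories = 57, 128, 18   # assumed tutorial sizes
    rnn = RNN(n_letters, n_hidden, n_categories)      # the cell defined above
    criterion = nn.NLLLoss()                          # negative log-likelihood
    learning_rate = 0.005

    def train(category_tensor, line_tensor):
        hidden = rnn.init_hidden()
        rnn.zero_grad()
        for i in range(line_tensor.size(0)):          # get predictions
            output, hidden = rnn(line_tensor[i], hidden)
        loss = criterion(output, category_tensor)     # eval predictions
        loss.backward()                               # compute gradient (BPTT)
        for p in rnn.parameters():                    # perform SGD by hand
            p.data.add_(p.grad.data, alpha=-learning_rate)
        return output, loss.item()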

SLIDE 93

Slide Credit

http://slazebni.cs.illinois.edu/spring17/lec01_cnn_architectures.pdf
http://slazebni.cs.illinois.edu/spring17/lec02_rnn.pdf