
Neural Networks (Continued). Instructor: Bhiksha Raj. Slides by Najim Dehak.

Machine Learning for Signal Processing: Neural Networks (Continued). Instructor: Bhiksha Raj. Slides by Najim Dehak. 1 Dec 2016. So what are neural networks? [Figure: a neural network maps a voice signal to a transcription, an image to a text caption, and a game state to the next move.]


  1. ImageNet • 1.2 million high-resolution images from the ImageNet LSVRC-2010 contest • 1000 different classes (softmax output layer) • NN configuration: the network contains 60 million parameters and 650,000 neurons • 5 convolutional layers, some of which are followed by max-pooling layers • 3 fully-connected layers. Krizhevsky, A., Sutskever, I. and Hinton, G. E., “ImageNet Classification with Deep Convolutional Neural Networks,” NIPS 2012: Neural Information Processing Systems, Lake Tahoe, Nevada.
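The configuration on this slide (5 convolutional layers, some followed by max pooling, then 3 fully-connected layers ending in a 1000-way softmax) can be sketched as follows. This is my own hedged PyTorch sketch, not code from the lecture; the channel sizes follow the common single-GPU AlexNet variant rather than the paper's exact two-GPU split.

```python
# Hedged sketch of an AlexNet-style network: 5 conv layers (some with max
# pooling) followed by 3 fully-connected layers and a 1000-way output.
import torch
import torch.nn as nn

alexnet_like = nn.Sequential(
    nn.Conv2d(3, 64, kernel_size=11, stride=4, padding=2), nn.ReLU(),
    nn.MaxPool2d(kernel_size=3, stride=2),
    nn.Conv2d(64, 192, kernel_size=5, padding=2), nn.ReLU(),
    nn.MaxPool2d(kernel_size=3, stride=2),
    nn.Conv2d(192, 384, kernel_size=3, padding=1), nn.ReLU(),
    nn.Conv2d(384, 256, kernel_size=3, padding=1), nn.ReLU(),
    nn.Conv2d(256, 256, kernel_size=3, padding=1), nn.ReLU(),
    nn.MaxPool2d(kernel_size=3, stride=2),
    nn.Flatten(),
    nn.Linear(256 * 6 * 6, 4096), nn.ReLU(),
    nn.Linear(4096, 4096), nn.ReLU(),
    nn.Linear(4096, 1000),          # 1000 classes; softmax is applied in the loss
)

x = torch.randn(1, 3, 224, 224)     # one 224x224x3 input image
print(alexnet_like(x).shape)        # torch.Size([1, 1000])
```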

  2. ImageNet Figure 3: 96 convolutional kernels of size 11×11×3 learned by the first convolutional layer on the 224×224×3 input images. The top 48 kernels were learned on GPU 1 while the bottom 48 kernels were learned on GPU 2. See Section 6.1 for details. Krizhevsky, A., Sutskever, I. and Hinton, G. E., “ImageNet Classification with Deep Convolutional Neural Networks,” NIPS 2012: Neural Information Processing Systems, Lake Tahoe, Nevada.

  3. ImageNet • Eight ILSVRC-2010 test images and the five labels considered most probable by our model. The correct label is written under each image, and the probability assigned to the correct label is also shown with a red bar (if it happens to be in the top 5). • Five ILSVRC-2010 test images in the first column; the remaining columns show the six training images that produce feature vectors in the last hidden layer with the smallest Euclidean distance from the feature vector for the test image. Krizhevsky, A., Sutskever, I. and Hinton, G. E., “ImageNet Classification with Deep Convolutional Neural Networks,” NIPS 2012: Neural Information Processing Systems, Lake Tahoe, Nevada.

  4. CNN for Automatic Speech Recognition • Convolution over frequencies • Convolution over time

  5. CNN-Recap • Neural network with specialized connectivity structure • Feed-forward: convolve input, non-linearity (rectified linear), pooling (local max) • Supervised training: train convolutional filters by back-propagating error • Convolution over time • Adding memory to the classical MLP network • Recurrent neural network. [Figure: input image → convolution (learned) → non-linearity → pooling → feature maps]
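As a concrete illustration of the feed-forward steps listed above (convolve the input, apply a rectified-linear non-linearity, pool locally), here is a minimal NumPy sketch. It is my own illustrative code with toy shapes, not material from the slides.

```python
# Minimal sketch of one CNN feature map: convolution -> ReLU -> local max pool.
import numpy as np

def conv2d_valid(image, kernel):
    """Plain 'valid' 2-D cross-correlation of a single-channel image."""
    H, W = image.shape
    kH, kW = kernel.shape
    out = np.zeros((H - kH + 1, W - kW + 1))
    for i in range(out.shape[0]):
        for j in range(out.shape[1]):
            out[i, j] = np.sum(image[i:i + kH, j:j + kW] * kernel)
    return out

def relu(x):
    return np.maximum(x, 0.0)

def max_pool(x, size=2):
    """Non-overlapping local max pooling."""
    H, W = x.shape
    H2, W2 = H // size, W // size
    x = x[:H2 * size, :W2 * size].reshape(H2, size, W2, size)
    return x.max(axis=(1, 3))

image = np.random.randn(28, 28)   # toy single-channel input
kernel = np.random.randn(5, 5)    # a "learned" convolutional filter
feature_map = max_pool(relu(conv2d_valid(image, kernel)))
print(feature_map.shape)          # (12, 12)
```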

  6. Recurrent Neural Networks (RNNs) • Recurrent networks introduce cycles and a notion of time. [Figure: RNN cell with input x_t, output y_t, and hidden state h_t fed back as h_{t-1} through a one-step delay.] • They are designed to process sequences of data x_1, …, x_n and can produce sequences of outputs y_1, …, y_m.

  7. Elman Nets (1990) – Simple Recurrent Neural Networks • Elman nets are feed-forward networks with partial recurrence • Unlike feed-forward nets, Elman nets have a memory, or sense of time • Can also be viewed as a “Markovian” NN

  8. (Vanilla) Recurrent Neural Network • The state consists of a single “hidden” vector h. [Figure: simple recurrent neural network with input x_t, output y_t, and hidden state h_t fed back as h_{t-1} through a one-step delay.]

  9. Unrolling RNNs • RNNs can be unrolled across multiple time steps. [Figure: the recurrent cell (x_t, h_t, y_t with a one-step delay) unrolled into a chain of states h_0, h_1, h_2 mapping inputs x_0, x_1, x_2 to outputs y_0, y_1, y_2.] • This produces a DAG which supports backpropagation. • But its size depends on the input sequence length.
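The unrolling can be made concrete with a short NumPy sketch of a vanilla RNN forward pass; this is my own illustration with assumed names (Wxh, Whh, Why) and toy dimensions, not code from the lecture. The same three weight matrices are reused at every time step, so the unrolled graph grows with the sequence length.

```python
# Unrolled forward pass of a vanilla RNN: the same weights apply at every step.
import numpy as np

def rnn_forward(xs, h0, Wxh, Whh, Why, bh, by):
    """Run one vanilla RNN over a sequence xs = [x_0, x_1, ...]."""
    h, hs, ys = h0, [], []
    for x in xs:
        h = np.tanh(Wxh @ x + Whh @ h + bh)   # new hidden state h_t
        y = Why @ h + by                      # output y_t (pre-softmax)
        hs.append(h)
        ys.append(y)
    return hs, ys

# Toy dimensions: 4-d inputs, 8-d hidden state, 3-d outputs, 5 time steps.
rng = np.random.default_rng(0)
D_in, D_h, D_out, T = 4, 8, 3, 5
params = dict(
    Wxh=rng.standard_normal((D_h, D_in)) * 0.1,
    Whh=rng.standard_normal((D_h, D_h)) * 0.1,
    Why=rng.standard_normal((D_out, D_h)) * 0.1,
    bh=np.zeros(D_h), by=np.zeros(D_out),
)
xs = [rng.standard_normal(D_in) for _ in range(T)]
hs, ys = rnn_forward(xs, np.zeros(D_h), **params)
print(len(ys), ys[0].shape)   # 5 outputs, each of shape (3,)
```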

  10. Learning time sequences • Recurrent networks have one or more feedback loops • Many tasks require learning a temporal sequence of events: speech, video, text, market data • These problems can be broken into 3 distinct types of task: 1. Sequence recognition: produce a particular output pattern when a specific input sequence is seen. Application: speech recognition. 2. Sequence reproduction: generate the rest of a sequence when the network sees only part of it. Applications: time-series prediction (stock market, sun spots, etc.). 3. Temporal association: produce a particular output sequence in response to a specific input sequence. Application: speech generation.

  11. RNN structure • Often layers are stacked vertically (deep RNNs). [Figure: two unrolled RNN layers over time steps 0–2; the lower layer maps inputs x_0, x_1, x_2 to hidden states h_00, h_01, h_02 and outputs y_00, y_01, y_02, which feed an upper layer with hidden states h_10, h_11, h_12 and outputs y_10, y_11, y_12. The same parameters are shared across time within each level; higher levels represent higher-level, more abstract features.]

  12-18. RNN structure • Backprop still works: it is called Backpropagation Through Time (BPTT). [Animation across these slides: activations flow forward through the unrolled, stacked network, from the inputs x_0, x_1, x_2 through the hidden states to the outputs at every level and time step.]

  19-25. RNN structure • Backprop still works. [Animation across these slides: gradients flow backward through the same unrolled network, from the outputs back through the hidden states, across layers and time steps.]

  26. The memory problem with RNNs • An RNN models signal context • If a very long context is needed, standard RNNs become unable to learn the context information: the gradients propagated across many time steps vanish (or explode), so long-range dependencies are effectively not learned.
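A quick numerical illustration (my own, not from the slides) of why this happens: the gradient reaching a step k time steps back is a product of k per-step Jacobians of the form diag(tanh'(a)) Whh, and the norm of that product tends to shrink or grow exponentially with k.

```python
# Back-propagating a gradient through many vanilla-RNN steps: the norm of the
# accumulated gradient typically vanishes (or explodes) exponentially.
import numpy as np

rng = np.random.default_rng(1)
D_h = 8
Whh = rng.standard_normal((D_h, D_h)) * 0.3   # recurrent weights (small scale)
grad = np.ones(D_h)                           # gradient arriving at the last step

for k in range(1, 51):
    a = rng.standard_normal(D_h)              # toy pre-activation at this step
    jacobian = np.diag(1.0 - np.tanh(a) ** 2) @ Whh
    grad = jacobian.T @ grad                  # back-propagate one step in time
    if k % 10 == 0:
        print(f"after {k:2d} steps: ||grad|| = {np.linalg.norm(grad):.2e}")
# With small recurrent weights the norm typically shrinks toward 0; scaling Whh
# up makes it explode instead. LSTMs (next slides) are designed to avoid this.
```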

  27. Standard RNNs to LSTM [Figure: side-by-side comparison of a standard RNN cell and an LSTM cell.]

  28. LSTM illustrated: input and forming new memory • The LSTM cell takes the following inputs (all vectors): the current input x_t, the past memory output h_{t-1}, and the past cell state C_{t-1}. [Figure: the cell state runs through the cell and is updated via a forget gate, an input gate, and the new candidate memory.]

  29. LSTM illustrated: output • The output of the cell is formed by applying the output gate to the cell state. [Figure: overall picture of the LSTM cell.]

  30. LSTM Equations
• i = σ(x_t U_i + h_{t-1} W_i)
• f = σ(x_t U_f + h_{t-1} W_f)
• o = σ(x_t U_o + h_{t-1} W_o)
• g = tanh(x_t U_g + h_{t-1} W_g)
• c_t = c_{t-1} ∘ f + g ∘ i
• h_t = tanh(c_t) ∘ o
• y = softmax(V h_t)
• i: input gate, how much of the new information will be let through to the memory cell.
• f: forget gate, responsible for deciding what information should be thrown away from the memory cell.
• o: output gate, how much of the information will be exposed to the next time step.
• g: self-recurrent update, equal to that of a standard RNN.
• c_t: internal memory of the memory cell.
• h_t: hidden state.
• y: final output. [Figure: LSTM memory cell]
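The equations above translate directly into code. The following NumPy sketch of a single LSTM step is my own illustration with assumed weight names and toy dimensions, not the lecture's code.

```python
# One LSTM step: gates i, f, o, candidate g, cell state c_t, hidden state h_t,
# and a softmax output y, following the equations on slide 30.
import numpy as np

def sigmoid(a):
    return 1.0 / (1.0 + np.exp(-a))

def softmax(a):
    e = np.exp(a - a.max())
    return e / e.sum()

def lstm_step(x_t, h_prev, c_prev, U, W, V):
    """U[k], W[k] hold the input/recurrent weights for k in {'i','f','o','g'}."""
    i = sigmoid(x_t @ U['i'] + h_prev @ W['i'])   # input gate
    f = sigmoid(x_t @ U['f'] + h_prev @ W['f'])   # forget gate
    o = sigmoid(x_t @ U['o'] + h_prev @ W['o'])   # output gate
    g = np.tanh(x_t @ U['g'] + h_prev @ W['g'])   # candidate memory
    c_t = c_prev * f + g * i                      # new cell state
    h_t = np.tanh(c_t) * o                        # new hidden state
    y = softmax(h_t @ V)                          # final output
    return h_t, c_t, y

# Toy dimensions: 4-d input, 6-d hidden state, 3-way output.
rng = np.random.default_rng(2)
D_in, D_h, D_out = 4, 6, 3
U = {k: rng.standard_normal((D_in, D_h)) * 0.1 for k in 'ifog'}
W = {k: rng.standard_normal((D_h, D_h)) * 0.1 for k in 'ifog'}
V = rng.standard_normal((D_h, D_out)) * 0.1
h, c = np.zeros(D_h), np.zeros(D_h)
h, c, y = lstm_step(rng.standard_normal(D_in), h, c, U, W, V)
print(y.sum())   # probabilities sum to 1
```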

  31. LSTM output synchronization

  32. (NLP) Applications of RNNs • Section overview – Language Model – Sentiment analysis / text classification – Machine translation and conversation modeling – Sentence skip-thought vectors

  33. RNN for language modeling

  34. Sentiment analysis / text classification • A quick example, to see the idea • Given text collections and their labels, predict labels for unseen texts.
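One common recipe for this task, sketched below with my own toy NumPy code (assumed sizes and names, not from the slides): embed the tokens, run an RNN over the sequence, and feed the final hidden state to a softmax classifier.

```python
# RNN text classification sketch: embeddings -> vanilla RNN -> softmax on the
# final hidden state.
import numpy as np

rng = np.random.default_rng(3)
V_size, D_emb, D_h, n_classes = 1000, 16, 32, 2   # vocab, embedding, hidden, labels

E = rng.standard_normal((V_size, D_emb)) * 0.1    # word embeddings
Wxh = rng.standard_normal((D_h, D_emb)) * 0.1
Whh = rng.standard_normal((D_h, D_h)) * 0.1
Wc = rng.standard_normal((n_classes, D_h)) * 0.1  # classifier on last hidden state

def classify(token_ids):
    h = np.zeros(D_h)
    for t in token_ids:                           # read the text left to right
        h = np.tanh(Wxh @ E[t] + Whh @ h)
    logits = Wc @ h                               # final state summarizes the text
    p = np.exp(logits - logits.max())
    return p / p.sum()

print(classify([12, 7, 301, 45]))                 # e.g. P(negative), P(positive)
```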

  35. Translating Videos to Natural Language Using Deep Recurrent Neural Networks • Subhashini Venugopalan, Huijuan Xu, Jeff Donahue, Marcus Rohrbach, Raymond Mooney, Kate Saenko. North American Chapter of the Association for Computational Linguistics (NAACL), Denver, Colorado, June 2015.

  36. Composing music with RNN http://www.hexahedria.com/2015/08/03/composing-music-with-recurrent-neural-networks/

  37. CNN-LSTM-DNN for speech recognition • Ensembles of RNN/LSTM, DNN, and ConvNets (CNN) give huge gains (state of the art): T. Sainath, O. Vinyals, A. Senior, H. Sak, “Convolutional, Long Short-Term Memory, Fully Connected Deep Neural Networks,” ICASSP 2015.
