Lecture 10: Recurrent Neural Networks
CS109B Data Science 2
Pavlos Protopapas and Mark Glickman
Sequence Modeling: Handwritten Text Recognition
- Input: Image
- Output: Text
https://towardsdatascience.com/build-a-handwritten-text-recognition-system-using-tensorflow-2326a3487cd5
Sequence Modeling: Text-to-Speech
- Input: Text
- Output: Audio
Sequence Modeling: Machine Translation
- Input: Text
- Output: Translated Text
Rapping-neural-network
https://github.com/robbiebarrat/rapping-neural-network
Outline
- Why RNNs
- Main Concept of RNNs
- More Details of RNNs
- RNN training
- Gated RNN
Why RNNs
What can my NN do?
[Figure: a neural network trained to recognize four people: George, Mary, Tom, Suzie.]
Training: present examples to the NN and let it learn from them.
What can my NN do?
[Figure: given one new example the trained NN answers George; given another, Mary.]
Prediction: given a new example, the NN outputs who it is.
What can my NN NOT do?
[Figure: "WHO IS IT?" A single ambiguous frame; the NN cannot tell from this one example alone.]
Learn from previous examples
[Figure: a sequence of frames along a time axis; the identity becomes clear by learning from the previous examples.]
Recurrent Neural Network (RNN)
[Figure: the network sees the sequence of frames and answers: George.]
Recurrent Neural Network (RNN)
[Figure: the network answers George: "I have seen George moving in this way before."]
RNNs recognize the data's sequential characteristics and use these patterns to predict the next likely scenario.
Recurrent Neural Network (RNN)
Our model requires context - or contextual information - to understand the subject (he) and the direct object (it) in the sentence.
[Figure: the sentence "He told me I could have it." The model asks: "WHO IS HE? I do not know. I need to know who said that and what he said before. Can you tell me more?"]
RNN – Another Example with Text
After providing sequential information, the model understood the subject (Joe's brother) and the direct object (sweater) in the sentence.
- Hellen: Nice sweater, Joe.
- Joe: Thanks, Hellen. It used to belong to my brother and he told me I could have it.
- Model: I see what you mean now! The noun "he" stands for Joe's brother, while "it" stands for the sweater.
Sequences
- We want a machine learning model to understand sequences, not isolated samples.
- Can an MLP do this?
- Assume we have a sequence of temperature measurements and we want to take 3 sequential measurements and predict the next one.

Original series:
time:        1   2   3   4   5   6   7  ...
temperature: 35  32  45  48  41  39  36  ...

Windowed samples (3 features, then the target):
sample 1: 35 32 45 -> 48
sample 2: 32 45 48 -> 41
sample 3: 45 48 41 -> 39
Windowed dataset
This is called an overlapping windowed dataset, since we are windowing the observations to create new samples. We can easily fit it with an MLP:
[Figure: an MLP with two hidden layers of 10 ReLU units and one output unit, taking each 3-value window as input.]
But re-arranging the order of the inputs, e.g. presenting (45, 35, 32) instead of (35, 32, 45), will produce the same results.
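To make this concrete, here is a minimal sketch, assuming NumPy and Keras are available, that builds the overlapping windows from the toy temperature series above and fits a small ReLU MLP like the one pictured:

```python
import numpy as np
from tensorflow import keras

series = np.array([35, 32, 45, 48, 41, 39, 36], dtype=np.float32)

def make_windows(y, width=3):
    """Slide a window of `width` values over the series; the next value is the target."""
    X = np.stack([y[i:i + width] for i in range(len(y) - width)])
    t = y[width:]
    return X, t

X, t = make_windows(series)   # X: (4, 3) windows, t: (4,) next values

# A small MLP with two hidden ReLU layers, as in the slide's diagram.
model = keras.Sequential([
    keras.layers.Dense(10, activation="relu", input_shape=(3,)),
    keras.layers.Dense(10, activation="relu"),
    keras.layers.Dense(1),
])
model.compile(optimizer="adam", loss="mse")
model.fit(X, t, epochs=100, verbose=0)
```

This treats the windows as independent samples; the following slides discuss why that ignores the sequential order.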
Why not CNNs or MLPs?
1. MLPs/CNNs require a fixed input and output size.
2. MLPs/CNNs can't recognize an input that appears in multiple places in the sequence (no sharing of what is learned across positions).
Windowed dataset
What follows after "I got in the car and ..."? Drove away.
What follows after "In car the and I ..."? Not obvious that it should be "drove away".
The order of words matters. This is true for most sequential data. A fully connected network will not distinguish the order and therefore misses some information.
Main Concept of RNNs
Memory
Somehow the computational unit should remember what it has seen before.
[Figure: a computational unit with input Y_t and output Z_t. It should remember the earlier inputs Y_1, ..., Y_{t-1}.]
Memory
[Figure: the same unit, now with an internal memory; input Y_t, output Z_t.]
Memory
We'll call this remembered information the unit's state.
[Figure: the unit, now labeled RNN, with its internal memory; input Y_t, output Z_t.]
Memory
In ordinary neural networks, once training is over the weights do not change: the network is done learning, and at prediction time it simply applies the operations that make up the network, using the weights it has learned. RNN units, however, are able to store new information after training has completed. That is, their internal state keeps changing after training is over.
Memory
Question: how can we do this? How can we build a unit that remembers the past?
The memory or state could be written to a file, but in RNNs we keep it inside the recurrent unit, in an array or a vector!
Let's work with an example:
- Anna Sofia said her shoes are too ugly. "Her" here means Anna Sofia.
- Nikolas put his keys on the table. "His" here means Nikolas.
Memory
[Figure: an RNN cell with its memory: input Y_t (e.g., "his") and output Z_t (e.g., "Nikolas").]
Building an RNN
[Figure: the RNN unrolled over time. At step t the cell reads Y_t, produces Z_t, and updates its memory; that memory is passed to the next copies of the cell, which process Y_{t+1}, Y_{t+2}, Y_{t+3} and produce Z_{t+1}, Z_{t+2}, Z_{t+3}.]
More Details of RNNs
Structure of an RNN cell
[Figure: an RNN cell with its internal state; input Y_t, output Z_t, and three sets of weights: input weights, update weights, and output weights.]
Image taken from A. Glassner, Deep Learning, Vol. 2: From Basics to Practice.
RNN training
Backprop Through Time
- For each input, unfold the network for the sequence length T.
- Back-propagation: apply the forward and backward pass on the unfolded network.
- Memory cost: O(T).
Backprop Through Time
[Figure: the RNN cell with state h_t, input Y_t, and output Z_t.]
- Input weights: V (applied to the input Y_t)
- Update weights: U (applied to the previous state)
- Output weights: W (produce the output Z_t)
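A minimal NumPy sketch of one forward step using this naming (V = input weights, U = update weights, W = output weights; the sizes and tanh/identity activations are illustrative assumptions):

```python
import numpy as np

rng = np.random.default_rng(0)
n_in, n_hidden, n_out = 1, 5, 1

V = rng.normal(size=(n_hidden, n_in))      # input weights
U = rng.normal(size=(n_hidden, n_hidden))  # update (recurrent) weights
W = rng.normal(size=(n_out, n_hidden))     # output weights
b, c = np.zeros(n_hidden), np.zeros(n_out)

def rnn_step(y_t, h_prev):
    """New state from the current input and previous state, then an output."""
    h_t = np.tanh(V @ y_t + U @ h_prev + b)  # hidden state update
    z_t = W @ h_t + c                        # output (identity activation here)
    return h_t, z_t

# Unroll over a short sequence, carrying the state forward.
h = np.zeros(n_hidden)
for y_t in [35.0, 32.0, 45.0, 48.0]:
    h, z = rnn_step(np.array([y_t]), h)
```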
Backprop Through Time
There are two activation functions: φ, which serves as the activation for the hidden state, and ψ, which is the activation of the output. In the example shown before, ψ was the identity.
[Figure: the network unrolled over three steps, with hidden states h_{t-2}, h_{t-1}, h_t, inputs Y_{t-2}, Y_{t-1}, Y_t, outputs z_{t-2}, z_{t-1}, z_t, and the shared weights V, U, W.]
Backprop Through Time
The loss is a sum over time steps:
L = Σ_t L_t, with L_t = L_t(z_t) and z_t = ψ(W h_t + c).

Gradient with respect to the output weights W:
dL/dW = Σ_t dL_t/dW = Σ_t (∂L_t/∂z_t)(∂z_t/∂W), where ∂z_t/∂W = ψ'(·) h_t.
Backprop Through Time
Now the gradient with respect to the input weights V, where
h_t = φ(V Y_t + U h_{t-1} + b) and z_t = ψ(W h_t + c) = ψ(W φ(V Y_t + U h_{t-1} + b) + c):

dL/dV = Σ_t (∂L_t/∂z_t)(∂z_t/∂h_t)(∂h_t/∂V)

Because h_t depends on h_{t-1}, which itself depends on V, the last factor expands over all earlier steps:

dh_t/dV = ∂h_t/∂V + (∂h_t/∂h_{t-1}) dh_{t-1}/dV + (∂h_t/∂h_{t-1})(∂h_{t-1}/∂h_{t-2}) dh_{t-2}/dV + ...
        = Σ_{k=1}^{t} (∂h_t/∂h_k)(∂h_k/∂V), with ∂h_t/∂h_k = Π_{j=k+1}^{t} ∂h_j/∂h_{j-1} and ∂h_j/∂h_{j-1} = φ'(·) U.

This long product of Jacobians is what makes gradients over long time spans vanish or explode.
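A toy NumPy illustration of that product of Jacobians (with a linear activation each factor ∂h_j/∂h_{j-1} is just the recurrent matrix, so we multiply copies of a random matrix rescaled to a chosen spectral radius; all numbers are made up):

```python
import numpy as np

rng = np.random.default_rng(1)
n = 8
A = rng.normal(size=(n, n))
A /= np.abs(np.linalg.eigvals(A)).max()   # rescale so the spectral radius is 1

for rho in (0.9, 1.1):                    # effective spectral radius below vs. above 1
    J = rho * A                           # stand-in for one Jacobian factor dh_j/dh_{j-1}
    prod = np.eye(n)
    for _ in range(60):                   # 60 time steps back
        prod = J @ prod
    print(rho, np.linalg.norm(prod))      # shrinks toward 0 for 0.9, blows up for 1.1
```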
Gradient Clipping
Gradient clipping prevents exploding gradients: clip the norm of the gradient before the update.
For a gradient g and some threshold v:
if ||g|| > v:  g <- v g / ||g||
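A minimal NumPy sketch of clip-by-norm (the threshold value is arbitrary); Keras optimizers expose the same idea through the clipnorm argument:

```python
import numpy as np

def clip_by_norm(g, v=5.0):
    """Rescale g so its L2 norm never exceeds the threshold v."""
    norm = np.linalg.norm(g)
    return g * (v / norm) if norm > v else g

g = np.array([30.0, -40.0])   # ||g|| = 50, above the threshold
print(clip_by_norm(g))        # rescaled to norm 5

# The same idea in Keras: keras.optimizers.SGD(learning_rate=0.01, clipnorm=5.0)
```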
Gated RNN
Long-term Dependencies
- Unfolded networks can be very deep.
- Long-term interactions are given exponentially smaller weights than short-term interactions.
- Gradients tend to either vanish or explode.
Long Short-Term Memory
- Handles long-term dependencies.
- Leaky units where the weight α on the self-loop is context-dependent.
- Allows the network to decide whether to accumulate or forget past information.
Notation
Using conventional and convenient notation
[Figure: the RNN cell drawn with input Y_t and output Z_t.]
Simple RNN again
[Figure: the simple RNN cell: the input Y_t enters through the weights V, the previous state h_t through the weights U; they are summed and passed through the activation σ to form the new state, which goes through the weights W and a second activation to produce the output Z_t.]
Simple RNN again: Memories
Simple RNN again: Memories - Forgetting
Simple RNN again: New Events
Simple RNN again: New Events Weighted
Simple RNN again: Updated memories
[Each of these steps highlights a different part of the same cell diagram: the stored memories (the state), how they are partially forgotten, how new events enter through the input, how they are weighted, and how the updated memories are written back to the state.]
Continue on Wednesday
RNN
Is it raining? We build an RNN to predict the probability that it is raining.
[Figure: each observation is fed to the network on its own, giving P(rain): "dog barking" → 0.3, "white shirt" → 0.1, "apple pie" → 0.1, "knee hurts" → 0.4, "get dark" → 0.6.]
RNN + Memory
[Figure: with memory, each step sees the current observation together with everything that came before ("dog barking", then "dog barking, white shirt", and so on). The predicted P(rain) now evolves over the sequence as 0.3, 0.1, 0.1, 0.6, 0.9, instead of the isolated predictions 0.3, 0.1, 0.1, 0.4, 0.6.]
RNN + Memory + Output
[Figure: the same sequence, but now each step also passes its output forward along with the memory; the predicted P(rain) over the steps is 0.3, 0.1, 0.1, 0.6, 0.9.]
LSTM: Long Short-Term Memory
Before really understanding the LSTM, let's see the big picture:
- Forget Gate
- Input Gate
- Cell State
- Output Gate
1. LSTMs are recurrent neural networks with a cell state and a hidden state; both are updated at each step and can be thought of as memories.
2. The cell state works as a long-term memory, and its update depends on the relation between the hidden state at t-1 and the input.
3. The hidden state of the next step is a transformation of the cell state, and it is also the output (the part generally used to calculate our loss, i.e., information that we want in short-term memory).
Let's think about my cell state.
Let's predict whether I will help you with the homework at time t.
Forget Gate ("Erase everything!")
The forget gate tries to estimate which features of the cell state should be forgotten.
Input Gate
The input gate layer works in a similar way to the forget gate layer: it estimates the degree of confidence in the new candidate values, which together give a new estimate of the cell state.
Cell state
After calculating the forget gate and the input gate, we can update our cell state.
Output Gate
- The output gate layer is calculated using the information of the input x at time t and the hidden state of the last step.
- It is important to notice that the hidden state used in the next step is obtained using the output gate layer, and it is usually what we use to compute the loss we optimize.
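Putting the four pieces together, here is a minimal NumPy sketch of one LSTM step (the weight shapes, names, and sigmoid/tanh choices follow the standard LSTM formulation and are illustrative, not taken from the slides):

```python
import numpy as np

def sigmoid(a):
    return 1.0 / (1.0 + np.exp(-a))

def lstm_step(x_t, h_prev, c_prev, p):
    """One LSTM step: forget gate, input gate, cell-state update, output gate."""
    z = np.concatenate([h_prev, x_t])          # previous hidden state + current input
    f = sigmoid(p["Wf"] @ z + p["bf"])         # forget gate: what to erase from the cell state
    i = sigmoid(p["Wi"] @ z + p["bi"])         # input gate: how much of the new candidate to accept
    c_tilde = np.tanh(p["Wc"] @ z + p["bc"])   # candidate cell state
    c_t = f * c_prev + i * c_tilde             # cell state update (long-term memory)
    o = sigmoid(p["Wo"] @ z + p["bo"])         # output gate
    h_t = o * np.tanh(c_t)                     # hidden state / output (short-term memory)
    return h_t, c_t

n_x, n_h = 4, 6
rng = np.random.default_rng(0)
p = {f"W{g}": 0.1 * rng.normal(size=(n_h, n_h + n_x)) for g in "fico"}
p.update({f"b{g}": np.zeros(n_h) for g in "fico"})

h, c = np.zeros(n_h), np.zeros(n_h)
h, c = lstm_step(rng.normal(size=n_x), h, c, p)
```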
To optimize my parameters I basically need to calculate all the derivatives at some time t ("wcct!" = "we can calculate this!").
So... every derivative is with respect to the cell state or the hidden state.
Let's calculate the cell state and the hidden state.
RNN Structures
One to one
[Figure: a single RNN cell with one input Y_t and one output Z_t.]
- The one-to-one structure is useless: it takes a single input and produces a single output.
- Not useful, because the RNN cell makes little use of its unique ability to remember things about its input sequence.
RNN Structures (cont)
Many to one
[Figure: inputs Y_{t-2}, Y_{t-1}, Y_t are read in turn, and a single output Z_t is produced at the end.]
The many-to-one structure reads in a sequence and gives us back a single value.
Example: sentiment analysis, where the network is given a piece of text and then reports on some quality inherent in the writing. A common example is to look at a movie review and determine whether it was positive or negative.
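A minimal Keras sketch of such a many-to-one sentiment model (vocabulary size and dimensions are placeholders, not values from the slides):

```python
from tensorflow import keras

# Sequence of word indices in, single sentiment probability out.
model = keras.Sequential([
    keras.layers.Embedding(input_dim=10_000, output_dim=32),  # word indices -> vectors
    keras.layers.LSTM(32),                                     # reads the sequence, returns its last state
    keras.layers.Dense(1, activation="sigmoid"),               # P(review is positive)
])
model.compile(optimizer="adam", loss="binary_crossentropy", metrics=["accuracy"])
```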
RNN Structures (cont)
One to many
[Figure: a single input Y_{t-2} produces a sequence of outputs Z_{t-2}, Z_{t-1}, Z_t.]
The one-to-many structure takes in a single piece of data and produces a sequence. For example, we give it the starting note for a song, and the network produces the rest of the melody for us.
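One hedged way to express one-to-many in Keras is to repeat the single input across the output time steps and let an LSTM produce a value per step (a sketch with made-up dimensions; melody generation is often done autoregressively instead):

```python
from tensorflow import keras

seq_len = 16  # number of output steps to generate

model = keras.Sequential([
    keras.Input(shape=(4,)),                             # a single starting "note" feature vector
    keras.layers.RepeatVector(seq_len),                  # copy it across the output time steps
    keras.layers.LSTM(32, return_sequences=True),        # one hidden state per step
    keras.layers.TimeDistributed(keras.layers.Dense(1)), # one output value per step
])
```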
RNN Structures (cont)
Many to many
[Figure: inputs Y_{t-2}, Y_{t-1}, Y_t each produce a corresponding output Z_{t-2}, Z_{t-1}, Z_t.]
The many-to-many structures are in some ways the most interesting; one form is used for machine translation (next slide). Example: predict whether it will rain at each step, given some inputs.
RNN Structures (cont)
[Figure: a delayed many-to-many structure: the inputs are read in first, and the outputs are produced afterwards.]
This form of many-to-many can be used for machine translation. For example, the English sentence "The black dog jumped over the cat" becomes, in Italian, "Il cane nero saltò sopra il gatto". In Italian the adjective "nero" (black) follows the noun "cane" (dog), so we need to have some kind of buffer so we can produce the words in their proper order.
Bidirectional
LSTMs and RNNs are designed to analyze sequences of values. For example: "Patrick said he needs a vacation." Here "he" means Patrick, and we know this because Patrick appears before the word "he". However, consider the following sentence: "He needs to work more, Pavlos said about Patrick." Here the clue comes after the pronoun, so we also want to read the sequence in the reverse direction. This is the bidirectional RNN (BRNN), or bidirectional LSTM (BLSTM) when using LSTM units.
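In Keras a bidirectional layer is a wrapper around a recurrent layer (a sketch; the dimensions are placeholders):

```python
from tensorflow import keras

model = keras.Sequential([
    keras.layers.Embedding(input_dim=10_000, output_dim=32),
    keras.layers.Bidirectional(keras.layers.LSTM(32)),  # one LSTM reads left-to-right, one right-to-left
    keras.layers.Dense(1, activation="sigmoid"),
])
```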
Bidirectional (cont)
[Figure: a bidirectional RNN unrolled over Y_{t-2}, Y_{t-1}, Y_t, with previous states flowing in both directions and outputs Z_{t-2}, Z_{t-1}, Z_t; on the right, the compact symbol for a BRNN cell with input Y_t and output Z_t.]
Deep RNN
LSTM units can be arranged in layers, so that the output of each unit is the input to a unit in the next layer. This is called a deep RNN, where the adjective "deep" refers to these multiple layers.
- Each layer feeds the LSTM on the next layer.
- The first time step of a feature is fed to the first LSTM, which processes that data and produces an output (and a new state for itself).
- That output is fed to the next LSTM, which does the same thing, and so on.
- Then the second time step arrives at the first LSTM, and the process repeats.
[Figure: a deep RNN unrolled over time: inputs Y_{t-2} ... Y_{t+2} at the bottom, stacked recurrent layers in the middle, and outputs Z_{t-2} ... Z_{t+2} at the top.]
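A hedged Keras sketch of a two-layer (deep) LSTM; return_sequences=True makes the first layer hand its full sequence of outputs to the second:

```python
from tensorflow import keras

model = keras.Sequential([
    keras.Input(shape=(None, 8)),                  # (time steps, features per step)
    keras.layers.LSTM(32, return_sequences=True),  # first layer: an output at every time step
    keras.layers.LSTM(32),                         # second layer: consumes that sequence
    keras.layers.Dense(1),
])
```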
Skip Connections
- Add additional connections between units d time steps apart.
- This creates paths through time where gradients neither vanish nor explode.
[Figure: an unrolled RNN at steps t-1, t, t+1 with extra connections that skip over time steps.]
Leaky Units
- Linear self-connections.
- Maintain a cell state that is a running average of the past hidden activations.
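A minimal sketch of the leaky-unit idea (the mixing weight alpha is a made-up constant here; in gated RNNs it becomes context-dependent):

```python
import numpy as np

def leaky_update(h_prev, x_t, alpha=0.9):
    """Linear self-connection: keep a running average of past activations."""
    candidate = np.tanh(x_t)                  # stand-in for the new hidden activation
    return alpha * h_prev + (1 - alpha) * candidate

h = np.zeros(3)
for x_t in np.random.default_rng(0).normal(size=(5, 3)):
    h = leaky_update(h, x_t)
```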
Standard RNN
C(t) = tanh(W h(t-1) + U x(t-1))
h(t) = C(t)
colah.github.io
Leaky Unit