Lecture 14: Recurrent Neural Networks
CS109B Data Science 2
Pavlos Protopapas, Mark Glickman, and Chris Tanner
Online lecture guidelines
- We would prefer you have your video on, but it is OK if you have it off.
- We would prefer you use your real name.
- All lectures, labs, and a-sections will be live streamed and also available for viewing later on Canvas/Zoom.
- We will have course staff in the chat online, and during lecture you can also make use of this spreadsheet to enter your own questions or 'up vote' those of your fellow students.
- Quizzes will be available for 24 hours.
Outline
- Why Recurrent Neural Networks (RNNs)
- Main Concept of RNNs
- More Details of RNNs
- RNN training
- Gated RNN
Background
Many classification and regression tasks involve data that is assumed to be independent and identically distributed (i.i.d.). For example:
Detecting lung cancer, face recognition, risk of heart attack.
Background
Much of our data is inherently sequential.
Examples at different scales (world, humanity, people, individual): natural disasters (e.g., earthquakes), climate change, stock market, virus outbreaks, speech recognition, machine translation (e.g., English -> French), cancer treatment.
Much of our data is inherently sequential.
PREDICTING EARTHQUAKES
Background
Much of our data is inherently sequential.
STOCK MARKET PREDICTIONS
Background
Much of our data is inherently sequential.
SPEECH RECOGNITION
“What is the weather today?” “What is the weather two day?” “What is the whether too day?” “What is, the Wrether to Dae?”
Background
Sequence Modeling: Handwritten Text
- Input : Image
- Output: Text
https://towardsdatascience.com/build-a-handwritten-text-recognition-system-using-tensorflow-2326a3487cd5
Sequence Modeling: Text-to-Speech
- Input : Text
- Output: Audio
Sequence Modeling: Machine Translation
- Input : Text
- Output: Translated Text
Outline
- Why RNNs
- Main Concept of RNNs (part 1)
- More Details of RNNs
- RNN training
- Gated RNN
What can my NN do?
(Figure: an NN trained on labeled examples of George, Mary, Tom, and Suzie.)
Training: present examples to the NN and let it learn from them.
What can my NN do?
(Figure: the trained NN labels new examples, e.g. George or Mary.)
Prediction: given a new example, the NN outputs its label.
What can my NN NOT do?
WHO IS IT?
Learn from previous examples
(Figure: a sequence of frames ordered in time.)
Recurrent Neural Network (RNN)
(Figure: the network, shown the sequence of frames, outputs 'George'.)
Recurrent Neural Network (RNN)
RNNs recognize the data's sequential characteristics and use patterns to predict the next likely scenario.
(Figure: the NN outputs 'George': "I have seen George moving in this way before.")
Recurrent Neural Network (RNN)
Our model requires context - or contextual information - to understand the subject (he) and the direct object (it) in the sentence.
"He told me I could have it." WHO IS HE?
I do not know. I need to know who said that and what he said before. Can you tell me more?
RNN – Another Example with Text
After providing sequential information, the model recognizes the subject (Joe's brother) and the object (sweater) in the sentence.
WHO IS HE?
- Hellen: Nice sweater, Joe.
- Joe: Thanks, Hellen. It used to belong to my brother and he told me I could have it.
I see what you mean now! The pronoun "he" refers to Joe's brother, while "it" refers to the sweater.
Sequences
- We want a machine learning model to understand sequences, not isolated samples.
- Can an MLP do this?
- Assume we have a sequence of temperature measurements, and we want to take 3 sequential measurements and predict the next one.
Samples (indices):        1   2   3   4   5   6   7   …
Features (temperatures): 35  32  45  48  41  39  36   …
Sliding the window one step at a time gives overlapping samples, each pairing 3 sequential measurements with the next measurement as the target:

Samples (indices)   Features (values)
1 2 3 4             35 32 45 48
2 3 4 5             32 45 48 41
3 4 5 6             45 48 41 39
Windowed dataset
This is called an overlapping windowed dataset, since we are windowing observations to create new samples. We can easily model it using an MLP:
(Figure: an MLP with 3 inputs and 1 output, two hidden layers of 10 ReLU units each and a ReLU output unit, fed each window of 3 measurements to predict the next one.)
But re-arranging the order of the inputs, e.g. presenting the window 35, 32, 45 as 45, 35, 32 (and likewise for every sample), will produce the same results: the MLP has no built-in notion of the order of its inputs.
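A minimal sketch of building such an overlapping windowed dataset with NumPy (the variable names and the toy series are just for illustration):

```python
import numpy as np

# Toy temperature series from the slide.
temps = np.array([35, 32, 45, 48, 41, 39, 36], dtype=float)

window = 3  # use 3 sequential measurements to predict the next one
X = np.array([temps[i:i + window] for i in range(len(temps) - window)])
y = np.array([temps[i + window] for i in range(len(temps) - window)])

print(X)  # rows: [35 32 45], [32 45 48], [45 48 41], [48 41 39]
print(y)  # targets: 48, 41, 39, 36
```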
Why not CNNs or MLPs?
1. MLPs/CNNs require fixed input and output size.
2. MLPs/CNNs can't classify inputs in multiple places.
Windowed dataset
What follows after 'I got in the car and'? 'drove away'.
What follows after 'In car the and I got'? It is not obvious that it should be 'drove away'.
The order of words matters, and this is true for most sequential data. A fully connected network will not distinguish the order and therefore misses some information.
Outline
- Why RNNs
- Main Concept of RNNs
- More Details of RNNs
- RNN training
- Gated RNN
Memory
Somehow the computational unit should remember what it has seen before.
(Figure: a unit with input $Y_t$ and output $Z_t$; it should remember $Y_0, \dots, Y_{t-1}$.)
Memory
Somehow the computational unit should remember what it has seen before.
(Figure: a unit with internal memory, input $Y_t$ and output $Z_t$.)
Memory
Somehow the computational unit should remember what it has seen before. We'll call this information the unit's state.
(Figure: an RNN unit with internal memory, input $Y_t$ and output $Z_t$.)
Memory
In neural networks, once training is over, the weights do not change: the network is done learning, and when we feed in values it simply applies the operations it has learned. RNN units, however, can remember new information after training has completed. Their internal state keeps changing as new inputs arrive, even though the weights stay fixed.
Memory
Question: How can we do this? How can we build a unit that remembers the past?
The memory or state could be written to a file, but in RNNs we keep it inside the recurrent unit, in an array or a vector!
Working with an example: "Anna Sofia said her shoes are too ugly." Here 'her' means Anna Sofia. "Nikolas put his keys on the table." Here 'his' means Nikolas.
Memory
(Figure: an RNN unit with memory; input $Y_t$, e.g. 'his', and output $Z_t$, e.g. 'Nikolas'.)
Building an RNN
(Figure: building an RNN by chaining the unit through time: at each step the unit receives the input $Y_t$ and the memory passed along from the previous step, and produces the output $Z_t$; the chain continues with $Y_{t+1} \to Z_{t+1}$, $Y_{t+2} \to Z_{t+2}$, $Y_{t+3} \to Z_{t+3}$, each step updating and passing on the memory.)
Outline
- Why RNNs
- Main Concept of RNNs
- More Details of RNNs
- RNN training
- Gated RNN
Structure of an RNN cell
(Figure: an RNN cell with internal state; input $Y_t$ and output $Z_t$; three sets of weights: input weights, update weights, and output weights.)
Anatomy of an RNN unit:
- Input $y_t$ and previous hidden state $h_{t-1}$
- Pre-activation: $A_t = W y_t + V h_{t-1} + c_0$ (input weights $W$, update weights $V$)
- Hidden state: $h_t = \psi(A_t)$, where $\psi$ is the hidden-state activation
- Outputs $z_t$: an output activation applied to $X_1 h_t + c_1$ and $X_2 h_t + c_2$
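A minimal NumPy sketch of one step of such a unit, assuming tanh for the hidden-state activation and the identity for the output; the sizes, names, and initialization are illustrative only:

```python
import numpy as np

d_in, d_h, d_out = 1, 8, 1              # illustrative sizes
rng = np.random.default_rng(0)
W = rng.normal(size=(d_h, d_in))        # input weights
V = rng.normal(size=(d_h, d_h))         # update (recurrent) weights
X_out = rng.normal(size=(d_out, d_h))   # output weights
c0, c = np.zeros(d_h), np.zeros(d_out)

def rnn_step(y_t, h_prev):
    A_t = W @ y_t + V @ h_prev + c0     # pre-activation
    h_t = np.tanh(A_t)                  # hidden-state activation (psi = tanh)
    z_t = X_out @ h_t + c               # output (phi = identity)
    return h_t, z_t

h, z = rnn_step(np.array([35.0]), np.zeros(d_h))
```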
Outline
- Why RNNs
- Main Concept of RNNs
- More Details of RNNs
- RNN training
- Gated RNN
Backprop Through Time
- For each input sequence, unfold the network for the sequence length T.
- Back-propagation: apply the forward and backward pass on the unfolded network.
- Memory cost: O(T) (see the sketch below).
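A sketch of unfolding the hypothetical rnn_step from earlier over a whole sequence; keeping every hidden state for the backward pass is where the O(T) memory cost comes from:

```python
def forward_unroll(ys, h0):
    """Run the cell over a length-T sequence, storing every hidden state."""
    h, states, outputs = h0, [], []
    for y_t in ys:              # one call per time step
        h, z = rnn_step(y_t, h)
        states.append(h)        # kept for the backward pass -> O(T) memory
        outputs.append(z)
    return states, outputs
```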
Backprop Through Time
(Figure: the RNN cell with its state, input $Y_t$, output $Z_t$, and its input weights, update weights, and output weights, which are the parameters learned during training.)
Backprop Through Time
(Figure: the same cell with the hidden state $h_t$ and the weights labeled: update weights U, output weights W, input weights V.)
Backprop Through Time
There are two activation functions: $\psi$, which serves as the activation for the hidden state, and $\phi$, which is the activation of the output. In the example shown before, $\phi$ was the identity.
ℎ"&7 𝑌"&7 V U W ℎ"&' 𝑌"&' V U W ℎ" 𝑌" V U W 𝑧 Et-2 𝑧 Et-1 𝑧 Et
CS109B, PROTOPAPAS, GLICKMAN, TANNER
Backprop Through Time
The per-step losses sum to the total loss:
$$L = \sum_t L_t, \qquad L_t = L_t(\hat z_t), \qquad \hat z_t = \phi(X h_t + c).$$
Gradient with respect to the output weights $X$:
$$\frac{\partial L}{\partial X} = \sum_t \frac{\partial L_t}{\partial X} = \sum_t \frac{\partial L_t}{\partial \hat z_t}\,\frac{\partial \hat z_t}{\partial X}, \qquad \frac{\partial \hat z_t}{\partial X} = \phi'(X h_t + c)\, h_t.$$
Backprop Through Time
For the update (recurrent) weights $V$, with
$$\hat z_t = \phi(X h_t + c), \qquad h_t = \psi(W y_t + V h_{t-1} + c_h), \qquad L = \sum_t L_t, \qquad L_t = L_t(\hat z_t),$$
so that $\hat z_t = \phi\big(X\,\psi(W y_t + V h_{t-1} + c_h) + c\big)$, the chain rule gives
$$\frac{\partial L}{\partial V} = \sum_t \frac{\partial L_t}{\partial \hat z_t}\,\frac{\partial \hat z_t}{\partial h_t}\,\frac{\partial h_t}{\partial V}.$$
Because $h_t$ depends on $V$ both directly and through all earlier hidden states,
$$\frac{\partial L_t}{\partial V} = \frac{\partial L_t}{\partial \hat z_t}\frac{\partial \hat z_t}{\partial h_t}\left(\frac{\partial h_t}{\partial V} + \frac{\partial h_t}{\partial h_{t-1}}\frac{\partial h_{t-1}}{\partial V} + \frac{\partial h_t}{\partial h_{t-1}}\frac{\partial h_{t-1}}{\partial h_{t-2}}\frac{\partial h_{t-2}}{\partial V} + \cdots\right),$$
where
$$\frac{\partial h_t}{\partial h_k} = \prod_{j=k+1}^{t} \frac{\partial h_j}{\partial h_{j-1}}, \qquad \frac{\partial h_j}{\partial h_{j-1}} = \psi'(A_j)\,V.$$
This long product of factors is what makes the gradients of long-term dependencies vanish or explode.
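A toy scalar illustration of why this long product of Jacobian factors is problematic (the numbers are purely illustrative, not from the slides):

```python
# If every factor dh_j/dh_{j-1} were a scalar a, the product over 100 steps is a**100.
for a in (0.9, 1.0, 1.1):
    print(a, a ** 100)   # ~2.7e-05 (vanishes), 1.0, ~1.4e+04 (explodes)
```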
CS109B, PROTOPAPAS, GLICKMAN, TANNER
53
CS109B, PROTOPAPAS, GLICKMAN, TANNER
Gradient Clipping
Prevents exploding gradients: clip the norm of the gradient before the update. For a gradient $v$ and a threshold $u$:
$$\text{if } \lVert v \rVert > u: \quad v \leftarrow \frac{u\,v}{\lVert v \rVert}$$
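A minimal NumPy sketch of clipping by norm; many frameworks also expose this directly (e.g. the clipnorm argument of Keras optimizers):

```python
import numpy as np

def clip_by_norm(grad, threshold):
    """Rescale grad so its L2 norm never exceeds threshold."""
    norm = np.linalg.norm(grad)
    if norm > threshold:
        grad = threshold * grad / norm
    return grad
```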
Outline
- Why RNNs
- Main Concept of RNNs
- More Details of RNNs
- RNN training
- Gated RNN
Long-term Dependencies
- Unfolded networks can be very deep.
- Long-term interactions are given exponentially smaller weights than short-term interactions.
- Gradients tend to either vanish or explode.
Long Short-Term Memory
- Handles long-term dependencies.
- Leaky units, where the weight α on the self-loop is context-dependent.
- Allows the network to decide whether to accumulate or forget past information.
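A minimal Keras sketch of an LSTM on the windowed temperature data; the layer sizes and optimizer settings are illustrative, not from the slides:

```python
from tensorflow import keras

# Windows of 3 measurements, 1 feature each, predicting the next value.
model = keras.Sequential([
    keras.layers.LSTM(32, input_shape=(3, 1)),  # gated recurrent layer
    keras.layers.Dense(1),                      # next-value prediction
])
model.compile(optimizer=keras.optimizers.Adam(clipnorm=1.0), loss="mse")
```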
Notation
Using conventional and convenient notation
(Figure: the recurrent unit drawn with input $Y_t$ and output $Z_t$.)
Simple RNN again
(Figure: the simple RNN cell drawn with its state, the weights V, W, U, sigmoid blocks, input $Y_t$, output $Z_t$, and hidden state $h_t$.)
Input $y_t$ and previous hidden state $h_{t-1}$:
$$A_t = W y_t + V h_{t-1} + c_0, \qquad h_t = \psi(A_t),$$
with outputs obtained by applying an activation to $X_1 h_t + c_1$ and $X_2 h_t + c_2$.
Simple RNN again: Memories
(Figure: the simple RNN cell with its state, weights V, W, U, input $Y_t$, output $Z_t$, and hidden state $h_t$; the following slides annotate this same diagram.)
Simple RNN again: Memories - Forgetting
Simple RNN again: New Events
Simple RNN again: New Events Weighted
Simple RNN again: Updated memories