  1. Recurrent Neural Networks CS 6956: Deep Learning for NLP

  2. Overview: 1. Modeling sequences 2. Recurrent neural networks: An abstraction 3. Usage patterns for RNNs 4. BiDirectional RNNs 5. A concrete example: The Elman RNN 6. The vanishing gradient problem 7. Long short-term memory units

  4. What can we do with such an abstraction? 1. The encoder: Convert a sequence into a feature vector for subsequent classification 2. A generator: Produce a sequence using an initial state 3. A transducer: Convert a sequence into another sequence 4. A conditioned generator (or an encoder-decoder): Combine 1 and 2

  5. 1. An Encoder: Convert a sequence into a feature vector for subsequent classification. (Diagram: starting from an initial state, the RNN reads the input "I like cake".)

  6. 1. An Encoder: Convert a sequence into a feature vector for subsequent classification. (Diagram: the encoder's final state is fed into a neural network.)

  7. 1. An Encoder: Convert a sequence into a feature vector for subsequent classification. (Diagram: the neural network's output is scored with a loss.)

  8. 1. An Encoder: Convert a sequence into a feature vector for subsequent classification. Example: Encode a sentence or a phrase into a feature vector for a classification task such as sentiment classification.
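
  The encoder pattern in code: a minimal sketch assuming PyTorch; the class name RNNEncoder, the dimensions, and the token ids are illustrative placeholders, not the course's reference implementation.

    import torch
    import torch.nn as nn

    class RNNEncoder(nn.Module):
        """Encode a token sequence into one feature vector, then classify it."""
        def __init__(self, vocab_size, embed_dim, hidden_dim, num_classes):
            super().__init__()
            self.embed = nn.Embedding(vocab_size, embed_dim)
            self.rnn = nn.RNN(embed_dim, hidden_dim, batch_first=True)
            self.classify = nn.Linear(hidden_dim, num_classes)

        def forward(self, token_ids):                     # token_ids: (batch, seq_len)
            _, final_state = self.rnn(self.embed(token_ids))
            return self.classify(final_state.squeeze(0))  # (batch, num_classes)

    # One three-token "sentence" ("I like cake" as arbitrary ids) with a gold label.
    model = RNNEncoder(vocab_size=1000, embed_dim=32, hidden_dim=64, num_classes=2)
    logits = model(torch.tensor([[11, 42, 7]]))
    loss = nn.CrossEntropyLoss()(logits, torch.tensor([1]))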

  9. 2. A Generator: Produce a sequence using an initial state. (Diagram: starting from an initial state, the RNN emits the sequence "I like cake"; the inputs at each step are empty (∅).)

  10. 2. A Generator: Produce a sequence using an initial state. (Diagram: each emitted token is scored with a loss.)

  11. 2. A Generator: Produce a sequence using an initial state. Maybe the previous output becomes the current input. (Diagram: the inputs are now ∅, "I", "like": each step consumes the token emitted at the previous step.)

  12. 2. A Generator: Produce a sequence using an initial state. Examples: text generation tasks.
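
  The generator pattern in code: a sketch assuming PyTorch, using an RNNCell with greedy decoding so that the previous output becomes the current input; all names and sizes are placeholders.

    import torch
    import torch.nn as nn

    class RNNGenerator(nn.Module):
        """Produce a sequence one token at a time from an initial state."""
        def __init__(self, vocab_size, embed_dim, hidden_dim):
            super().__init__()
            self.embed = nn.Embedding(vocab_size, embed_dim)
            self.cell = nn.RNNCell(embed_dim, hidden_dim)
            self.out = nn.Linear(hidden_dim, vocab_size)

        def generate(self, start_token, initial_state, max_len=10):
            token = torch.tensor([start_token])
            state = initial_state                        # (1, hidden_dim), e.g. zeros
            generated = []
            for _ in range(max_len):
                state = self.cell(self.embed(token), state)
                token = self.out(state).argmax(dim=-1)   # previous output -> next input
                generated.append(token.item())
            return generated

    gen = RNNGenerator(vocab_size=1000, embed_dim=32, hidden_dim=64)
    print(gen.generate(start_token=0, initial_state=torch.zeros(1, 64)))

  At training time the loss would compare each step's distribution over the vocabulary against the gold next token; the greedy argmax above is only used for generation.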

  13. 3. A Transducer: Convert a sequence into another sequence. (Diagram: starting from an initial state, each input token is mapped to an output label: I → Pronoun, like → Verb, cake → Noun.)

  14. 3. A Transducer: Convert a sequence into another sequence. (Diagram: each output label is scored with a loss.)
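
  The transducer pattern in code: a sketch of a part-of-speech tagger, assuming PyTorch; the vocabulary size, tag-set size, and gold tag ids are placeholders.

    import torch
    import torch.nn as nn

    class RNNTagger(nn.Module):
        """Map every input token to an output label (e.g. a part-of-speech tag)."""
        def __init__(self, vocab_size, embed_dim, hidden_dim, num_tags):
            super().__init__()
            self.embed = nn.Embedding(vocab_size, embed_dim)
            self.rnn = nn.RNN(embed_dim, hidden_dim, batch_first=True)
            self.tag = nn.Linear(hidden_dim, num_tags)

        def forward(self, token_ids):
            states, _ = self.rnn(self.embed(token_ids))  # one hidden state per position
            return self.tag(states)                      # (batch, seq_len, num_tags)

    tagger = RNNTagger(vocab_size=1000, embed_dim=32, hidden_dim=64, num_tags=17)
    logits = tagger(torch.tensor([[11, 42, 7]]))         # "I like cake" as arbitrary ids
    # Per-token loss against gold tags (Pronoun, Verb, Noun as arbitrary tag ids).
    loss = nn.CrossEntropyLoss()(logits.view(-1, 17), torch.tensor([5, 13, 8]))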

  15. 4. Conditioned generator (or an encoder-decoder): First encode a sequence, then generate another one. (Diagram: first encode the sequence "I like cake" from an initial state.)

  16. 4. Conditioned generator (or an encoder-decoder): First encode a sequence, then generate another one. (Diagram: then decode it to produce a different sequence, the Marathi "मला आवडतो केक", i.e. "I like cake"; the decoder inputs are ∅.)

  17. 4. Conditioned generator (or an encoder-decoder): First encode a sequence, then generate another one. Example: a building block for neural machine translation.
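
  The encoder-decoder pattern in code: a sketch assuming PyTorch and teacher forcing (the gold target prefix is fed to the decoder during training); vocabulary sizes, dimensions, and token ids are placeholders.

    import torch
    import torch.nn as nn

    class Seq2Seq(nn.Module):
        """Encode a source sentence into a vector, then decode a target sentence from it."""
        def __init__(self, src_vocab, tgt_vocab, embed_dim, hidden_dim):
            super().__init__()
            self.src_embed = nn.Embedding(src_vocab, embed_dim)
            self.tgt_embed = nn.Embedding(tgt_vocab, embed_dim)
            self.encoder = nn.RNN(embed_dim, hidden_dim, batch_first=True)
            self.decoder = nn.RNN(embed_dim, hidden_dim, batch_first=True)
            self.out = nn.Linear(hidden_dim, tgt_vocab)

        def forward(self, src_ids, tgt_ids):
            # Encode: the final hidden state summarizes the source sentence.
            _, summary = self.encoder(self.src_embed(src_ids))
            # Decode: produce target-side states conditioned on that summary.
            dec_states, _ = self.decoder(self.tgt_embed(tgt_ids), summary)
            return self.out(dec_states)                  # (batch, tgt_len, tgt_vocab)

    model = Seq2Seq(src_vocab=1000, tgt_vocab=1200, embed_dim=32, hidden_dim=64)
    logits = model(torch.tensor([[11, 42, 7]]), torch.tensor([[1, 55, 90, 23]]))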

  18. Stacking RNNs • A commonly seen usage pattern • An RNN takes an input sequence and produces an output sequence • The input to an RNN can itself be the output of another RNN: stacked RNNs, also called deep RNNs • Two or more layers often seem to improve prediction performance
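
  Stacking in code: in PyTorch-style libraries the stack is usually a constructor argument rather than hand-wired layers; a sketch assuming torch.nn.RNN.

    import torch
    import torch.nn as nn

    # The output sequence of layer 1 becomes the input sequence of layer 2.
    # bidirectional=True would additionally make each layer a BiRNN.
    stacked = nn.RNN(input_size=32, hidden_size=64, num_layers=2, batch_first=True)

    x = torch.randn(1, 3, 32)        # (batch, seq_len, input_size)
    states, h_n = stacked(x)
    print(states.shape)              # states of the top layer only: (1, 3, 64)
    print(h_n.shape)                 # final state of each layer: (2, 1, 64)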
