SLIDE 1

Recurrent Language Models

CMSC 470 Marine Carpuat

SLIDE 2

Toward a Neural Language Model

Figures by Philipp Koehn (JHU)

SLIDE 3

Count-based n-gram models vs. feedforward neural networks

  • Pros of feedforward neural LM
  • Word embeddings capture generalizations across word types
  • Cons of feedforward neural LM
  • Closed vocabulary
  • Training/testing is more computationally expensive
  • Weaknesses of both types of model
  • Only work well for word prediction if the test corpus looks like the training corpus
  • Only capture short distance context
SLIDE 4

Language Modeling with Recurrent Neural Networks

Figure by Philipp Koehn

SLIDE 5

Recurrent Neural Networks (RNN)

The hidden layer includes a recurrent connection as part of its input. The RNN can be unrolled over the time sequence into a feed-forward network.

Figures from Jurafsky & Martin

The hidden layer from the previous time step plays the role of memory, remembering earlier context
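To make the recurrence concrete, here is a minimal NumPy sketch of a single RNN time step (the names W, U, b and the dimensions are illustrative, not from the slides): the new hidden state is computed from the current input together with the previous hidden state, which is what lets the network remember earlier context.

```python
import numpy as np

# Minimal sketch of one RNN time step (illustrative names and sizes).
def rnn_step(x_t, h_prev, W, U, b):
    # x_t:    input embedding at time t, shape (d_in,)
    # h_prev: hidden state from time t-1, shape (d_h,)
    return np.tanh(W @ x_t + U @ h_prev + b)

d_in, d_h = 4, 3
rng = np.random.default_rng(0)
W = rng.normal(size=(d_h, d_in))   # input -> hidden
U = rng.normal(size=(d_h, d_h))    # hidden -> hidden (the recurrent connection)
b = np.zeros(d_h)
h = np.zeros(d_h)                  # initial "empty memory"
h = rnn_step(rng.normal(size=d_in), h, W, U, b)
```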

SLIDE 6

Unrolled RNN illustrated

The weights U, V, W are shared across all time steps
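A hedged sketch of what unrolling looks like in code, assuming a simple tanh RNN cell and toy dimensions: the same W, U, V matrices are applied at every time step.

```python
import numpy as np

# Sketch of unrolling the cell over a sequence: the SAME W, U, V are
# reused at every time step (weight sharing across time).
rng = np.random.default_rng(0)
d_in, d_h, d_out, T = 4, 3, 5, 6
W = rng.normal(size=(d_h, d_in))    # input -> hidden
U = rng.normal(size=(d_h, d_h))     # hidden -> hidden (recurrence)
V = rng.normal(size=(d_out, d_h))   # hidden -> output

xs = rng.normal(size=(T, d_in))     # a toy input sequence
h = np.zeros(d_h)
outputs = []
for x_t in xs:                      # one iteration per time step
    h = np.tanh(W @ x_t + U @ h)    # same W, U every step
    outputs.append(V @ h)           # same V every step
```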

SLIDE 7

Prediction/Inference with RNNs

For language modeling, f is the softmax function, which provides a normalized probability distribution over all possible output classes.
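A minimal sketch of that final softmax step, with a made-up five-word vocabulary and made-up scores standing in for the output layer's activations at one time step:

```python
import numpy as np

# Turn raw output-layer scores into a probability distribution
# over the vocabulary (toy numbers, illustrative vocabulary).
def softmax(z):
    z = z - z.max()              # subtract max for numerical stability
    e = np.exp(z)
    return e / e.sum()

vocab = ["the", "cat", "sat", "mat", "</s>"]
scores = np.array([2.0, 0.5, 1.0, -1.0, 0.1])   # stand-in for V @ h_t
p_next = softmax(scores)                        # sums to 1.0
print(dict(zip(vocab, p_next.round(3))))
```
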
SLIDE 8

Training RNNs with backpropagation

  • Training goal: estimate parameter values for U, V, W
  • Use the same loss as for feedforward language models
  • Given the unrolled network, run the forward and backpropagation algorithms as usual (see the sketch below)
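A hedged sketch of this training step using PyTorch (the layer sizes, optimizer, and toy data are illustrative assumptions; nn.RNN handles the unrolling): the loss is the usual cross-entropy over next-word predictions, and loss.backward() runs backpropagation through the unrolled network.

```python
import torch
import torch.nn as nn

vocab_size, d_emb, d_h = 1000, 32, 64
emb = nn.Embedding(vocab_size, d_emb)          # word embeddings
rnn = nn.RNN(d_emb, d_h, batch_first=True)     # unrolled automatically
out = nn.Linear(d_h, vocab_size)               # plays the role of V
params = list(emb.parameters()) + list(rnn.parameters()) + list(out.parameters())
opt = torch.optim.SGD(params, lr=0.1)

tokens = torch.randint(0, vocab_size, (1, 11))   # toy token ids
inputs, targets = tokens[:, :-1], tokens[:, 1:]  # predict the next word

hidden_states, _ = rnn(emb(inputs))              # forward pass over all steps
logits = out(hidden_states)                      # scores over the vocabulary
# Same cross-entropy loss as the feedforward LM, summed over time steps
loss = nn.functional.cross_entropy(logits.reshape(-1, vocab_size),
                                   targets.reshape(-1))
opt.zero_grad()
loss.backward()   # backpropagation through the unrolled network
opt.step()
```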

SLIDE 9

Training RNNs with backpropagation

SLIDE 10

Practical Training Issues: vanishing/exploding gradients

Figure by Graham Neubig

Multiple ways to work around this problem:

  • ReLU activations help
  • Dedicated RNN architectures (Long Short Term Memory Networks)
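A toy demonstration of why the problem arises, assuming a simplified recurrent matrix: backpropagating through many time steps multiplies the gradient by (roughly) the recurrent weights at every step, so it either shrinks toward zero or blows up depending on their scale.

```python
import numpy as np

# Illustrative demo of vanishing/exploding gradients: repeated
# multiplication by the recurrent matrix over T time steps.
d_h, T = 3, 50
for scale in (0.5, 1.5):                 # "small" vs. "large" recurrent weights
    U = scale * np.eye(d_h)              # toy recurrent matrix
    grad = np.ones(d_h)
    for _ in range(T):
        grad = U.T @ grad                # one step of backprop through time
    print(scale, np.linalg.norm(grad))   # ~1e-15 (vanished) vs. ~1e9 (exploded)
```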

SLIDE 11

Aside: Long Short Term Memory Networks
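For reference, a rough NumPy sketch of one LSTM step using the standard gate equations (variable names and shapes are illustrative): the gates decide what to forget from the cell state, what new content to write, and how much of the cell state to expose, and the additive cell-state update is what helps gradients survive over long distances.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def lstm_step(x_t, h_prev, c_prev, Wx, Wh, b):
    z = Wx @ x_t + Wh @ h_prev + b          # all four gates computed at once
    i, f, o, g = np.split(z, 4)             # input, forget, output, candidate
    i, f, o, g = sigmoid(i), sigmoid(f), sigmoid(o), np.tanh(g)
    c = f * c_prev + i * g                  # additive cell-state update
    h = o * np.tanh(c)                      # exposed hidden state
    return h, c

d_in, d_h = 4, 3
rng = np.random.default_rng(0)
Wx = rng.normal(size=(4 * d_h, d_in))
Wh = rng.normal(size=(4 * d_h, d_h))
b = np.zeros(4 * d_h)
h, c = lstm_step(rng.normal(size=d_in), np.zeros(d_h), np.zeros(d_h), Wx, Wh, b)
```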

SLIDE 12

What do Recurrent Language Models Learn?

Figure from Karpathy 2015

SLIDE 13

What do Recurrent Language Models Learn?

Figure from Karpathy 2015

SLIDE 14

What do Recurrent Language Models Learn?

  • Parameters are hard to interpret, so we can gain insights by analyzing their output behavior instead

  • Can capture (some) long-distance dependencies

After much economic progress over the years, the country has…
The country, which has made much economic progress over the years, still has…

SLIDE 15

Recurrent neural network language models

  • Have all the strengths of the feedforward language model
  • And do a better job at modeling long distance context
  • However
  • Training is trickier due to vanishing/exploding gradients
  • Performance on test sets is still sensitive to distance from training data