slide-1
SLIDE 1

CS11-747 Neural Networks for NLP

Recurrent Neural Networks

Graham Neubig

Site https://phontron.com/class/nn4nlp2017/

slide-2
SLIDES 2-7

NLP and Sequential Data

  • NLP is full of sequential data
  • Words in sentences
  • Characters in words
  • Sentences in discourse

slide-8
SLIDES 8-12

Long-distance Dependencies in Language

  • Agreement in number, gender, etc.
  • Selectional preference

He does not have very much confidence in himself.
She does not have very much confidence in herself.

The reign has lasted as long as the life of the queen.
The rain has lasted as long as the life of the clouds.

slide-13
SLIDES 13-19

Can be Complicated!

  • What is the referent of “it”?

The trophy would not fit in the brown suitcase because it was too big. (Trophy)
The trophy would not fit in the brown suitcase because it was too small. (Suitcase)

(from the Winograd Schema Challenge: http://commonsensereasoning.org/winograd.html)

slide-20
SLIDES 20-23

Recurrent Neural Networks (Elman 1990)

  • Tools to “remember” information

(Diagram: a feed-forward NN (lookup, transform, predict) compared with a recurrent NN, which additionally feeds its previous context back into the transform at each step)

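To make the “remember” idea concrete, here is a minimal sketch of the Elman recurrence in plain Python/NumPy (the sizes and names are illustrative, not from the slides): the new state is a squashed linear function of the current input and the previous state, so information can persist across steps.

import numpy as np

# Illustrative sizes; not from the slides
D_IN, D_HID = 64, 128
rng = np.random.default_rng(0)
W_x = rng.normal(scale=0.1, size=(D_HID, D_IN))   # input -> hidden
W_h = rng.normal(scale=0.1, size=(D_HID, D_HID))  # hidden -> hidden (the recurrence)
b = np.zeros(D_HID)

def rnn_step(x_t, h_prev):
    # Elman update: h_t = tanh(W_x x_t + W_h h_{t-1} + b)
    return np.tanh(W_x @ x_t + W_h @ h_prev + b)

h = np.zeros(D_HID)                      # initial state
for x_t in rng.normal(size=(4, D_IN)):   # a toy 4-step input sequence
    h = rnn_step(x_t, h)                 # h carries information forward in time
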
slide-24
SLIDES 24-33

Unrolling in Time

  • What does processing a sequence look like?

(Diagram: the RNN is applied once per word of “I hate this movie”; each step reads one word, updates the state, and makes a label prediction)

slide-34
SLIDES 34-41

Training RNNs

(Diagram: for “I hate this movie”, each RNN step makes a prediction; each prediction is compared against its label to give a per-step loss, and the per-step losses are summed into a total loss)

slide-42
SLIDES 42-47

RNN Training

  • The unrolled graph is a well-formed computation graph (a DAG), so we can run backprop as usual
  • Parameters are tied across time; derivatives are aggregated across all time steps
  • This is historically called “backpropagation through time” (BPTT)

slide-48
SLIDES 48-49

Parameter Tying

(Diagram: the same unrolled training graph as before; all four RNN steps use the same parameters)

Parameters are shared! Derivatives are accumulated.

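A minimal sketch of what the picture describes, assuming DyNet (the 2-class output layer and the names are hypothetical): one set of RNN parameters is reused at every step, per-step losses are summed, and a single backward pass through the summed loss accumulates derivatives for the shared parameters across all time steps.

import dynet as dy

model = dy.Model()
RNN = dy.SimpleRNNBuilder(1, 64, 128, model)   # one set of RNN parameters, reused at every step
W_sm = model.add_parameters((2, 128))          # shared output layer (2 classes, hypothetical)
trainer = dy.SimpleSGDTrainer(model)

def sequence_loss(word_vec_values, labels):
    dy.renew_cg()
    W = dy.parameter(W_sm)
    s = RNN.initial_state()
    losses = []
    for x_val, y in zip(word_vec_values, labels):
        s = s.add_input(dy.inputVector(x_val))   # the same RNN parameters at each step
        losses.append(dy.pickneglogsoftmax(W * s.output(), y))
    return dy.esum(losses)                       # total loss = sum of per-step losses

# One update: a single backward pass through the summed loss (BPTT) accumulates
# each shared parameter's gradient over all time steps.
# loss = sequence_loss(xs, ys); loss.backward(); trainer.update()
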

slide-50
SLIDE 50

Applications of RNNs

slide-51
SLIDES 51-55

What Can RNNs Do?

  • Represent a sentence
  • Read whole sentence, make a prediction
  • Represent a context within a sentence
  • Read context up until that point

slide-56
SLIDES 56-60

Representing Sentences

(Diagram: read “I hate this movie” with the RNN, then make a single prediction from the final state)

  • Sentence classification
  • Conditioned generation
  • Retrieval

slide-61
SLIDES 61-65

Representing Contexts

(Diagram: read “I hate this movie” with the RNN and predict a label at every position)

  • Tagging
  • Language Modeling
  • Calculating Representations for Parsing, etc.

slide-66
SLIDES 66-81

e.g. Language Modeling

  • Language modeling is like a tagging task, where each tag is the next word!

(Diagram: feed <s> to the RNN and predict “I”; feed “I” and predict “hate”; feed “hate” and predict “this”; feed “this” and predict “movie”; feed “movie” and predict </s>)

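Concretely, a small worked example (illustrative, not from the slides) of the “tags”: the targets are just the input sequence shifted by one position.

sent = ["I", "hate", "this", "movie"]
inputs  = ["<s>"] + sent            # what the RNN reads at each step
targets = sent + ["</s>"]           # the "tag" to predict at each step
pairs = list(zip(inputs, targets))
# [('<s>', 'I'), ('I', 'hate'), ('hate', 'this'), ('this', 'movie'), ('movie', '</s>')]
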

slide-82
SLIDES 82-90

Bi-RNNs

  • A simple extension, run the RNN in both directions

(Diagram: for “I hate this movie”, a forward RNN and a backward RNN each read the sentence; at every position the two hidden states are concatenated and fed through a softmax to predict a tag: PRN, VB, DET, NN)

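A minimal Bi-RNN tagger sketch, assuming DyNet (the tag count, the word-embedding expressions wembs, and the weight names are illustrative; dy.renew_cg() is assumed to have been called by the caller): run one RNN left-to-right and one right-to-left, concatenate the two states at each position, and score the tags.

import dynet as dy

ntags = 10                                       # hypothetical tag set size
model = dy.Model()
fRNN = dy.SimpleRNNBuilder(1, 64, 128, model)    # left-to-right RNN
bRNN = dy.SimpleRNNBuilder(1, 64, 128, model)    # right-to-left RNN
W_tag = model.add_parameters((ntags, 256))       # scores over the two concatenated 128-dim states
b_tag = model.add_parameters(ntags)

def birnn_tag_scores(wembs):                     # wembs: list of word-embedding expressions
    W, b = dy.parameter(W_tag), dy.parameter(b_tag)
    # forward pass over the sentence
    f_states, s = [], fRNN.initial_state()
    for x in wembs:
        s = s.add_input(x)
        f_states.append(s.output())
    # backward pass over the reversed sentence
    b_states, s = [], bRNN.initial_state()
    for x in reversed(wembs):
        s = s.add_input(x)
        b_states.append(s.output())
    b_states.reverse()
    # concatenate per position and return one score vector per word
    return [W * dy.concatenate([f, bk]) + b for f, bk in zip(f_states, b_states)]
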

slide-91
SLIDE 91

Let’s Try it Out!

slide-92
SLIDES 92-96

Recurrent Neural Networks in DyNet

  • Based on “*Builder” class (*=SimpleRNN/LSTM)
  • Add parameters to model (once):

# Simple RNN or LSTM (layers=1, input=64, hidden=128, model)
RNN = dy.SimpleRNNBuilder(1, 64, 128, model)   # or dy.LSTMBuilder(1, 64, 128, model)

  • Add parameters to CG and get initial state (per sentence):

s = RNN.initial_state()

  • Update state and access (per input word/character):

s = s.add_input(x_t)
h_t = s.output()

slide-97
SLIDE 97

RNNLM Example: Parameter Initialization

# Lookup parameters for word embeddings
WORDS_LOOKUP = model.add_lookup_parameters((nwords, 64))

# Word-level RNN (layers=1, input=64, hidden=128, model)
RNN = dy.SimpleRNNBuilder(1, 64, 128, model)

# Softmax weights/biases on top of RNN outputs
W_sm = model.add_parameters((nwords, 128))
b_sm = model.add_parameters(nwords)

slide-98
SLIDE 98

RNNLM Example: Sentence Initialization

# Build the language model graph
def calc_lm_loss(wids):
    dy.renew_cg()
    # parameters -> expressions
    W_exp = dy.parameter(W_sm)
    b_exp = dy.parameter(b_sm)
    # add parameters to CG and get state
    f_init = RNN.initial_state()
    # get the word vectors for each word ID
    wembs = [WORDS_LOOKUP[wid] for wid in wids]
    # start the RNN by inputting "<s>"
    s = f_init.add_input(wembs[-1])

slide-99
SLIDE 99

RNNLM Example: Loss Calculation and State Update

    # process each word ID and embedding
    losses = []
    for wid, we in zip(wids, wembs):
        # calculate and save the softmax loss
        score = W_exp * s.output() + b_exp
        loss = dy.pickneglogsoftmax(score, wid)
        losses.append(loss)
        # update the RNN state with the input
        s = s.add_input(we)
    # return the sum of all losses
    return dy.esum(losses)
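
To put calc_lm_loss to work, a training loop along these lines would do (a sketch; the trainer choice, epoch count, and train_sentences are illustrative, not from the slides):

trainer = dy.SimpleSGDTrainer(model)
for epoch in range(10):
    for wids in train_sentences:    # each sentence as a list of word IDs (hypothetical data)
        loss = calc_lm_loss(wids)
        loss.scalar_value()          # run the forward pass (and get the loss value)
        loss.backward()              # backpropagation through time over the unrolled graph
        trainer.update()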

slide-100
SLIDE 100

Code Examples

sentiment-rnn.py

slide-101
SLIDE 101

RNN Problems and Alternatives

slide-102
SLIDES 102-103

Vanishing Gradient

  • Gradients decrease as they get pushed back
  • Why? “Squashed” by non-linearities or small weights in matrices.
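
A toy illustration (numbers are made up): if each step through the recurrence multiplies the gradient by a factor below 1 (small weights, squashing non-linearities), the signal from distant steps shrinks exponentially.

grad = 1.0
for t in range(20):
    grad *= 0.5        # each step multiplies by a Jacobian factor < 1 (small weights / squashing)
print(grad)            # ~9.5e-07: almost no learning signal reaches the earliest steps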

slide-104
SLIDES 104-107

A Solution: Long Short-term Memory
(Hochreiter and Schmidhuber 1997)

  • Basic idea: make additive connections between time steps
  • Addition does not modify the gradient, no vanishing
  • Gates to control the information flow

slide-108
SLIDE 108

LSTM Structure
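
For reference, a standard LSTM cell in NumPy (a sketch of the usual formulation; the exact variant pictured on the slide may differ, and all names and sizes here are illustrative): gates i, f, o control an additive update to the memory cell c, which is the connection that lets gradients flow.

import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def lstm_step(x, h_prev, c_prev, W, U, b):
    # W, U, b hold parameters for the 4 parts: (i)nput gate, (f)orget gate, (o)utput gate, (u)pdate
    i = sigmoid(W["i"] @ x + U["i"] @ h_prev + b["i"])   # input gate
    f = sigmoid(W["f"] @ x + U["f"] @ h_prev + b["f"])   # forget gate
    o = sigmoid(W["o"] @ x + U["o"] @ h_prev + b["o"])   # output gate
    u = np.tanh(W["u"] @ x + U["u"] @ h_prev + b["u"])   # candidate update
    c = f * c_prev + i * u    # additive connection between time steps: gradient flows through "+"
    h = o * np.tanh(c)
    return h, c

D_IN, D_HID = 8, 16
rng = np.random.default_rng(0)
W = {k: rng.normal(scale=0.1, size=(D_HID, D_IN)) for k in "ifou"}
U = {k: rng.normal(scale=0.1, size=(D_HID, D_HID)) for k in "ifou"}
b = {k: np.zeros(D_HID) for k in "ifou"}
h, c = np.zeros(D_HID), np.zeros(D_HID)
for x in rng.normal(size=(5, D_IN)):   # toy 5-step sequence
    h, c = lstm_step(x, h, c, W, U, b)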

slide-109
SLIDES 109-112

Other Alternatives

  • Lots of variants of LSTMs (Hochreiter and Schmidhuber, 1997)
  • Gated recurrent units (GRUs; Cho et al., 2014)
  • All follow the basic paradigm of “take input, update state”

slide-113
SLIDE 113

Code Examples

sentiment-lstm.py lm-lstm.py

slide-114
SLIDE 114

Efficiency/Memory Tricks

slide-115
SLIDES 115-119

Handling Mini-batching

  • Mini-batching makes things much faster!
  • But mini-batching in RNNs is harder than in feed-forward networks
  • Each word depends on the previous word
  • Sequences are of various length

slide-120
SLIDES 120-126

Mini-batching Method

this is an      example  </s>
this is another </s>     </s>   (the last </s> is padding)

  • Padding: pad the shorter sequences up to the longest length in the batch
  • Loss Calculation: compute the per-position loss for every sequence
  • Mask: multiply each loss by 1 for real positions, 0 for padded positions
        1 1 1 1 1
        1 1 1 1 0
  • Take Sum

(Or use DyNet automatic mini-batching, much easier but a bit slower)
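
A small sketch of the idea in plain Python (the names and the stand-in loss function are illustrative): pad every sequence in the batch to the longest length, build the mask, and multiply per-position losses by the mask before summing.

batch = [["this", "is", "an", "example", "</s>"],
         ["this", "is", "another", "</s>"]]

max_len = max(len(s) for s in batch)
padded = [s + ["</s>"] * (max_len - len(s)) for s in batch]         # pad with </s>
mask   = [[1.0] * len(s) + [0.0] * (max_len - len(s)) for s in batch]

def word_loss(word, position, sequence):                            # stand-in for the real LM loss
    return 1.0

total = 0.0
for seq, m in zip(padded, mask):
    for t, (w, keep) in enumerate(zip(seq, m)):
        total += keep * word_loss(w, t, seq)                        # padded positions contribute 0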

slide-127
SLIDES 127-129

Bucketing/Sorting

  • If we use sentences of different lengths, too much padding can result in wasted computation and decreased performance
  • To remedy this: sort sentences so similarly-lengthed sentences are in the same batch
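
For instance (a sketch with hypothetical toy data): sort the corpus by length before slicing it into mini-batches, so each batch needs little or no padding.

corpus = [["a", "b"], ["a"], ["a", "b", "c", "d"], ["a", "b", "c"]]   # toy sentences
batch_size = 2

by_length = sorted(corpus, key=len)                                   # similar lengths end up together
batches = [by_length[i:i + batch_size] for i in range(0, len(by_length), batch_size)]
# batches: [[['a'], ['a','b']], [['a','b','c'], ['a','b','c','d']]]  -> at most 1 pad token per batch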

slide-130
SLIDE 130

Code Example

lm-minibatch.py

slide-131
SLIDES 131-134

Handling Long Sequences

  • Sometimes we would like to capture long-term dependencies over long sequences
  • e.g. words in full documents
  • However, the fully unrolled graph may not fit in (GPU) memory

slide-135
SLIDES 135-153

Truncated BPTT

  • Backprop over shorter segments, initialize w/ the state from the previous segment

(Diagram: the 1st pass runs the RNN over “I hate this movie” and backprops within that segment; the 2nd pass runs over “It is so bad”, initialized with the final state of the 1st pass; only the state is carried over, with no backprop through it)
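
A sketch of the control flow in plain Python with an Elman-style step (the segmenting and names are illustrative): each segment would get its own backward pass, and only the numeric value of the last hidden state is handed to the next segment.

import numpy as np

D = 8
rng = np.random.default_rng(0)
W_x = rng.normal(scale=0.1, size=(D, D))
W_h = rng.normal(scale=0.1, size=(D, D))

def run_segment(word_vectors, h0):
    h = h0
    for x in word_vectors:
        h = np.tanh(W_x @ x + W_h @ h)
    return h

document = rng.normal(size=(8, D))        # a "long" sequence of 8 toy word vectors
segments = [document[:4], document[4:]]   # e.g. "I hate this movie" / "It is so bad"

h = np.zeros(D)
for seg in segments:
    h = run_segment(seg, h)
    # in a real framework you would compute this segment's loss and backprop here,
    # then carry only the numeric state value across the boundary (no gradient flows
    # back into the previous segment)
    h = h.copy()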

slide-154
SLIDE 154

Pre-training/Transfer for RNNs

slide-155
SLIDES 155-158

RNN Strengths/Weaknesses

  • RNNs, particularly deep RNNs/LSTMs, are quite powerful and flexible
  • But they require a lot of data
  • Also have trouble with weak error signals passed back from the end of the sentence

slide-159
SLIDES 159-162

Pre-training/Transfer

  • Train for one task, solve another
  • Pre-training task: Big data, easy to learn
  • Main task: Small data, harder to learn

slide-163
SLIDES 163-166

Example: LM -> Sentence Classifier
(Luong et al. 2015)

  • Train a language model first: lots of data, easy-to-learn objective
  • Sentence classification: little data, hard-to-learn objective
  • Results in much better classification, competitive or better than CNN-based methods

slide-167
SLIDES 167-170

Why Pre-training?

  • The model learns consistencies in the data (Karpathy et al. 2015)
  • Model learns syntax (Shi et al. 2017) or semantics (Radford et al. 2017)

slide-171
SLIDE 171

Questions?