LTI Orientation
Deep Learning Research for NLP
Graham Neubig

Language Processing
John passes the ball upfield to Peter, who shoots for the goal. The shot is deflected by Mary and the ball goes out of bounds.
→ Mary prevents Peter from scoring a goal.
Structured Prediction
[Diagram: input X mapped to structured output Y]
Supervised Learning
[Diagram: learning a mapping from inputs X to outputs Y from paired examples]
Supervised Learning w/ Neural Nets
[Diagram: a neural network with parameters θ learned from input/output pairs (X, Y)]
Structured Prediction w/ Neural Nets
Neural Structured Prediction
[Diagram: a model with parameters w maps input X to output Y, and a loss is computed on Y]
Neural Structured Prediction
[Diagram: the same pipeline, now with an explicit search step over structured outputs Y]
The Problem of Discrete Decisions
[Plot: the loss g(w) as a function of the parameters w is piecewise constant, jumping wherever the discrete prediction ŷ_i(w) changes between 'dog', 'the', and 'cat']
Soft Search [Goyal+18]
(Faculty: Neubig)
Instead of feeding the embedding of a single hard prediction (argmax ŷ_i = 'the') into the next hidden state h_{i+1}, feed a peaked softmax over the embeddings of the whole vocabulary:

ẽ_i = Σ_y e(y) · exp[α · s_i(y)] / Z

where s_i(y) are the model's scores (e.g., s_i(dog), s_i(the), s_i(cat)), e(y) are the word embeddings, and Z is the normalizer. As α → ∞, the α-soft argmax ẽ_i approaches the embedding of the argmax prediction.
Smoothed Surface
[Plot: smoothed loss surfaces g(w) for α = 1 and α = 10; larger α more closely approximates the original piecewise-constant surface with regions ŷ_i(w) = 'dog', 'the', 'cat']
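A minimal numpy sketch of the peaked-softmax relaxation above; the vocabulary, scores, and embedding values are illustrative stand-ins, not the setup of [Goyal+18]:

```python
import numpy as np

def alpha_soft_argmax(scores, embeddings, alpha):
    """e_tilde = sum_y e(y) * exp(alpha * s(y)) / Z: a convex mix of embeddings."""
    weights = np.exp(alpha * (scores - scores.max()))  # subtract max for stability
    weights /= weights.sum()                           # normalize by Z
    return weights @ embeddings

vocab = ["dog", "the", "cat"]
scores = np.array([0.1, 2.0, 0.5])       # s_i(y): model scores at step i
embeddings = np.random.randn(3, 4)       # e(y): toy 4-dim word embeddings

for alpha in [1.0, 10.0, 100.0]:
    print(alpha, np.round(alpha_soft_argmax(scores, embeddings, alpha), 3))
# As alpha grows, the output approaches the embedding of the argmax word 'the'.
```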
Prediction over Word Embedding Space [Kumar+18]
(Faculty: Tsvetkov)
Structured Modeling w/ Neural Nets
Why Structured Neural Nets
- In pre-neural NLP we did feature engineering to capture the salient features of text
- Now, neural nets capture features for us
- But given too much freedom they will not learn, or will overfit
- So we do architecture engineering to add inductive bias
Structure in Language
- Words, phrases, and sentences: e.g., the parse of "Alice gave a message to Bob" (NP, VP, and PP phrases under S)
- Documents: "This film was completely unbelievable. The characters were wooden and the plot was absurd. That being said, I liked it."
BiLSTM Conditional Random Fields [Ma+15]
(Faculty: Hovy)
- Add an additional layer that ensures consistency between tags
- Example: "<s> I hate this movie <s>" tagged PRP VBP DT NN
- Training and prediction use dynamic programming
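To make the dynamic-programming step concrete, here is a hedged sketch of Viterbi decoding for a linear-chain CRF; the emission and transition scores are random toy values, not the outputs of a trained BiLSTM:

```python
import numpy as np

def viterbi(emissions, transitions):
    """emissions: (T, K) per-token tag scores; transitions: (K, K) tag-to-tag scores."""
    T, K = emissions.shape
    score = emissions[0].copy()            # best score of any path ending in each tag
    back = np.zeros((T, K), dtype=int)     # backpointers
    for t in range(1, T):
        cand = score[:, None] + transitions + emissions[t][None, :]  # (K, K)
        back[t] = cand.argmax(axis=0)
        score = cand.max(axis=0)
    tags = [int(score.argmax())]           # follow backpointers from the best final tag
    for t in range(T - 1, 0, -1):
        tags.append(int(back[t][tags[-1]]))
    return tags[::-1]

tagset = ["PRP", "VBP", "DT", "NN"]
emissions = np.random.randn(4, 4)          # scores for "I hate this movie"
transitions = np.random.randn(4, 4)
print([tagset[t] for t in viterbi(emissions, transitions)])
```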
Neural Factor Graph Models [Malaviya+18]
(Faculty: Gormley, Neubig)
- Problem: Neural CRFs can only handle a single tag per word
- Idea: Expand to multiple tags using graphical models
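As a rough illustration of scoring multiple tags per word with a factor graph; all factor tables below are random stand-ins, not the neural factors of [Malaviya+18]:

```python
import numpy as np

T, pos_K, num_K = 3, 4, 2                   # 3 words; POS and Number attributes
unary_pos = np.random.randn(T, pos_K)       # per-word POS factors
unary_num = np.random.randn(T, num_K)       # per-word Number factors
pair_attr = np.random.randn(pos_K, num_K)   # POS-Number factor within a word
pair_seq = np.random.randn(pos_K, pos_K)    # POS-POS factor between adjacent words

def score(pos_tags, num_tags):
    """Log-score of one joint assignment = sum of all factor log-potentials."""
    s = sum(unary_pos[t, pos_tags[t]] + unary_num[t, num_tags[t]]
            + pair_attr[pos_tags[t], num_tags[t]] for t in range(T))
    s += sum(pair_seq[pos_tags[t], pos_tags[t + 1]] for t in range(T - 1))
    return s

print(score([0, 1, 2], [0, 1, 0]))
```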
Stack LSTM [Dyer+15]
(Faculty: Dyer, now at DeepMind)
- Example: parsing "an overhasty decision was made"
[Figure: transition-based dependency parsing with stack LSTMs; separate LSTMs encode the stack S, the buffer B, and the action history A, each read at its TOP pointer, and actions such as SHIFT, REDUCE_L(amod), and REDUCE_R build the tree.]
- Compositional representation: each REDUCE composes head and dependent into a single vector on the stack
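A bare-bones sketch of the shift-reduce transition system that the stack LSTM scores; the dependency labels and exact action names here are illustrative, and the stack/buffer/action LSTMs themselves are omitted:

```python
def parse(words, actions):
    stack, buffer, arcs = [], list(words), []
    for act in actions:
        if act == "SHIFT":
            stack.append(buffer.pop(0))
        elif act.startswith("REDUCE_L"):          # left arc: second <- top
            label = act[act.index("(") + 1:-1]
            dep = stack.pop(-2)
            arcs.append((stack[-1], label, dep))  # head is the remaining top
        elif act.startswith("REDUCE_R"):          # right arc: top <- second
            label = act[act.index("(") + 1:-1]
            dep = stack.pop()
            arcs.append((stack[-1], label, dep))
    return arcs

print(parse(["an", "overhasty", "decision", "was", "made"],
            ["SHIFT", "SHIFT", "SHIFT", "REDUCE_L(amod)", "REDUCE_L(det)",
             "SHIFT", "SHIFT", "REDUCE_L(auxpass)", "REDUCE_L(nsubjpass)"]))
# -> arcs such as (decision, amod, overhasty) and (made, nsubjpass, decision)
```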
Morphological Language Models [Matthews+18]
(Faculty: Neubig, Dyer)
- Problem: Language modeling for morphologically rich languages is hard
- Idea: Specifically decompose input and output using morphological structure
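As a toy illustration of composing word representations from morphemes; the segmentation and the simple additive composition are assumptions for the sketch, not the architecture of [Matthews+18]:

```python
import numpy as np

# Toy morpheme embedding table with made-up entries.
morph_emb = {m: np.random.randn(8) for m in ["un", "break", "able", "dog", "s"]}

def word_vector(morphemes):
    """Represent a word as the sum of its morpheme embeddings."""
    return np.sum([morph_emb[m] for m in morphemes], axis=0)

v1 = word_vector(["un", "break", "able"])   # "unbreakable"
v2 = word_vector(["dog", "s"])              # "dogs"
print(v1.shape, v2.shape)
```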
Neural-Symbolic Integration
Neural-Symbolic Hybrids
- Neural and symbolic models are better at different things
- Neural: smoothing over differences using similarity
- Symbolic: remembering individual single-shot events
- How can we combine the two?
Discrete Lexicons in Neural Seq2seq [Arthur+15]
(Faculty: Neubig)
NNs + Logic Rules [Hu+16]
(Faculty: Hovy, Xing)
- Problem: It is difficult to explicitly incorporate knowledge into neural-net-based models
- Idea: Use logical rules to constrain the space of predicted probabilities
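One simple way to realize such a constraint, sketched with toy values in the spirit of the posterior-regularization teacher used in [Hu+16]: downweight rule-violating labels in the predicted distribution and renormalize:

```python
import numpy as np

def constrain(p, rule_satisfied, strength=2.0):
    """q(y) proportional to p(y) * exp(-strength * (1 - rule(y)))."""
    q = p * np.exp(-strength * (1.0 - rule_satisfied))
    return q / q.sum()

p = np.array([0.6, 0.3, 0.1])      # network's predicted distribution (toy)
rule = np.array([0.0, 1.0, 1.0])   # label 0 violates the logic rule
print(constrain(p, rule))          # mass shifts toward rule-consistent labels
```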
Latent Variable Models
Latent Variable Models
[Diagram: observed variables X and Y connected through an unobserved latent variable Z]
Neural Latent Variable Models
[Diagram: a neural network parameterizes the relationship between X and the latent variable Z]
Generating Text from Latent Space
[Diagram: text X is encoded into a latent code Z and decoded back into text X]
Example: Discourse Level Modeling with VAE [Zhao+17]
(Faculty: Eskenazi)
- Use a latent variable to represent the entire discourse in a dialog
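A minimal PyTorch sketch of the VAE machinery behind such models, i.e., encoding to a latent code, reparameterizing, and decoding; the sizes and modules are toy stand-ins, not the dialog model of [Zhao+17]:

```python
import torch
import torch.nn as nn

class TinyVAE(nn.Module):
    def __init__(self, x_dim=16, z_dim=4):
        super().__init__()
        self.enc = nn.Linear(x_dim, 2 * z_dim)  # outputs mu and logvar
        self.dec = nn.Linear(z_dim, x_dim)

    def forward(self, x):
        mu, logvar = self.enc(x).chunk(2, dim=-1)
        z = mu + torch.randn_like(mu) * (0.5 * logvar).exp()  # reparameterize
        recon = self.dec(z)
        kl = 0.5 * (mu.pow(2) + logvar.exp() - logvar - 1).sum(-1).mean()
        return recon, kl

vae = TinyVAE()
x = torch.randn(8, 16)
recon, kl = vae(x)
loss = (recon - x).pow(2).sum(-1).mean() + kl   # reconstruction + KL
loss.backward()
```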
Handling Discrete Latent Variables [Zhou+17]
(Faculty: Neubig)
Structured Latent Variables [Yin+18]
(Faculty: Neubig)
- Problem: Paucity of training data for structured prediction problems
- Idea: Treat the structure as a latent variable in a VAE model
Better Learning Algorithms for Latent Variable Models [He+19]
(Faculty: Neubig)
- Problem: When learning latent variable models, predicting the latent variables can be difficult
- Solution: Perform aggressive updates of the part of the model that predicts these variables
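A schematic sketch of such an update schedule: take many steps on the inference network (which predicts the latent variables) per step on the generator. The tiny model below is a hypothetical stand-in, and [He+19] uses a convergence criterion rather than a fixed inner-step count:

```python
import torch
import torch.nn as nn

enc = nn.Linear(8, 2)   # inference network: predicts latent mean and logvar
dec = nn.Linear(1, 8)   # generator
enc_opt = torch.optim.SGD(enc.parameters(), lr=0.1)
dec_opt = torch.optim.SGD(dec.parameters(), lr=0.1)

def neg_elbo(x):
    mu, logvar = enc(x).chunk(2, dim=-1)
    z = mu + torch.randn_like(mu) * (0.5 * logvar).exp()
    recon = (dec(z) - x).pow(2).sum(-1)
    kl = 0.5 * (mu.pow(2) + logvar.exp() - logvar - 1).sum(-1)
    return (recon + kl).mean()

for step in range(5):
    x = torch.randn(16, 8)
    for _ in range(30):                  # aggressive phase: inference net only
        loss = neg_elbo(x)
        enc_opt.zero_grad(); loss.backward(); enc_opt.step()
    loss = neg_elbo(x)                   # then one step on the generator
    dec_opt.zero_grad(); loss.backward(); dec_opt.step()
```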