

SLIDE 1

From Dependency Parsing to Imitation Learning

CMSC 723 / LING 723 / INST 725 Marine Carpuat

Fig credits: Joakim Nivre, Yoav Goldberg, Hal Daume III

SLIDE 2

Today's topics:

  • Addressing compounding error
    • Improving on gold parse oracle
    • Research highlight: [Goldberg & Nivre, 2012]
  • Imitation learning for structured prediction
    • CIML ch 18
SLIDE 3

Improving the oracle in transition-based dependency parsing

  • Issues with the oracle we've used so far
    • Based on the configuration sequence that produces the gold tree
    • What if there are multiple sequences for a single gold tree?
    • How can we recover if the parser deviates from the gold sequence?
  • Goldberg & Nivre [2012] propose an improved oracle
SLIDE 4

Exercise: which of these transition sequences produces the gold tree on the left?

SLIDE 5

Parser configuration: a stack, a buffer, and a set of dependency arcs. An arc goes from position j (the head) to position i (the dependent), with dependency label l.
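
To make this concrete, here is a minimal Python sketch of a configuration for an arc-standard transition system; the class and its method names are illustrative, not from the slides.

```python
# Minimal sketch of a transition-based parser configuration
# (arc-standard system; names are illustrative). An arc is stored as
# (head, label, dependent), i.e., an arc from position j to position i
# with label l is (j, l, i).

class Configuration:
    def __init__(self, n_words):
        self.stack = [0]                           # position 0 is ROOT
        self.buffer = list(range(1, n_words + 1))  # word positions 1..n
        self.arcs = set()                          # {(head, label, dep)}

    def shift(self):
        self.stack.append(self.buffer.pop(0))

    def left_arc(self, label):
        # second-topmost stack item becomes dependent of the topmost
        dep = self.stack.pop(-2)
        self.arcs.add((self.stack[-1], label, dep))

    def right_arc(self, label):
        # topmost stack item becomes dependent of the second-topmost
        dep = self.stack.pop()
        self.arcs.add((self.stack[-1], label, dep))

    def is_terminal(self):
        return not self.buffer and len(self.stack) == 1
```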

SLIDE 6

Which of these transition sequences does the oracle algorithm produce?
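
As a reminder, the oracle we've used so far is a static one: given the gold tree, it deterministically picks one canonical transition per configuration. A sketch, assuming the Configuration class above and gold heads and labels given as dicts keyed by word position:

```python
# Static oracle sketch for the arc-standard system: returns the single
# "canonical" transition for a configuration, given the gold tree.
# gold_head[i] / gold_label[i]: gold head and label of word i.

def static_oracle(config, gold_head, gold_label):
    if len(config.stack) >= 2:
        top, below = config.stack[-1], config.stack[-2]
        # LEFT-ARC if the item below the top is a gold dependent of the top
        if below != 0 and gold_head[below] == top:
            return ("LEFT-ARC", gold_label[below])
        # RIGHT-ARC only once the top has collected all its gold dependents
        attached = {dep for (_, _, dep) in config.arcs}
        top_deps = {d for d, h in gold_head.items() if h == top}
        if gold_head.get(top) == below and top_deps <= attached:
            return ("RIGHT-ARC", gold_label[top])
    return ("SHIFT", None)
```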

SLIDE 7

Improving the oracle in transition-based dependency parsing

  • Issues with the oracle we've used so far
    • Based on the configuration sequence that produces the gold tree
    • What if there are multiple sequences for a single gold tree?
    • How can we recover if the parser deviates from the gold sequence?
  • Goldberg & Nivre [2012] propose an improved oracle
SLIDE 8

At test time, suppose the 4th transition predicted is SHIFT instead of RIGHT-ARC(iobj). What happens if we apply the oracle next?

SLIDE 9

Measuring distance from gold tree

  • Labeled attachment loss: number of arcs in the gold tree that are not found in the predicted tree

(Figure: two example predicted trees, with Loss = 3 and Loss = 1.)
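
Because both trees are just sets of (head, label, dependent) arcs, the loss is a set difference; a minimal sketch with a worked loss-of-1 example:

```python
# Labeled attachment loss: gold arcs missing from the predicted tree.
def attachment_loss(gold_arcs, predicted_arcs):
    return len(set(gold_arcs) - set(predicted_arcs))

# e.g. one wrong label => one gold arc not found => loss 1
gold = {(2, "nsubj", 1), (0, "root", 2), (2, "dobj", 3)}
pred = {(2, "nsubj", 1), (0, "root", 2), (2, "iobj", 3)}
assert attachment_loss(gold, pred) == 1
```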

SLIDE 10

Improving the oracle in transition-based dependency parsing

  • Issues with the oracle we've used so far
    • Based on the configuration sequence that produces the gold tree
    • What if there are multiple sequences for a single gold tree?
    • How can we recover if the parser deviates from the gold sequence?
  • Goldberg & Nivre [2012] propose an improved oracle
SLIDE 11

Proposed solution: 2 key changes to training algorithm

  • Any transition that can possibly lead to a correct tree is considered correct
  • Explore non-optimal transitions during training (see the sketch below)
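
As a training-loop sketch (loosely following Goldberg & Nivre 2012; model.score, model.update, transition_cost, and config.apply are assumed interfaces, and the exploration schedule is simplified to a fixed probability):

```python
import random

# Sketch of online training with a dynamic oracle. Change 1: every
# zero-cost transition (one that can still lead to the gold tree)
# counts as correct, not just the canonical oracle transition.
# Change 2: sometimes follow the model's own, possibly non-optimal,
# prediction so training visits configurations off the gold path.

def train_on_sentence(model, config, gold_arcs, explore_p=0.9):
    while not config.is_terminal():
        scores = model.score(config)          # dict: transition -> score
        predicted = max(scores, key=scores.get)
        # at least one transition always has cost 0 by definition
        zero_cost = [t for t in scores
                     if transition_cost(config, t, gold_arcs) == 0]
        best_correct = max(zero_cost, key=scores.get)
        if predicted not in zero_cost:
            model.update(config, good=best_correct, bad=predicted)
            # exploration: follow the wrong prediction with prob. explore_p
            action = predicted if random.random() < explore_p else best_correct
        else:
            action = predicted
        config.apply(action)
```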

SLIDE 12

Proposed solution: 2 key changes to training algorithm

SLIDE 13

Defining the cost of a transition

  • Loss difference between the minimum-loss trees achievable before and after the transition
  • Loss for trees nicely decomposes into losses for arcs
  • We can compute transition cost by counting gold arcs that are no longer reachable after the transition (sketch below)
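
The transition_cost helper assumed in the earlier training sketch can be phrased directly in these terms; reachable_gold_arcs and config.copy_and_apply are assumed helpers (for the arc-eager system used by Goldberg & Nivre, reachability can be computed exactly from the stack and buffer contents):

```python
# Cost of a transition = gold arcs reachable before it minus gold arcs
# still reachable after it. Since tree loss decomposes over arcs,
# counting the arcs that become unreachable is enough.

def transition_cost(config, transition, gold_arcs):
    before = reachable_gold_arcs(config, gold_arcs)
    after = reachable_gold_arcs(config.copy_and_apply(transition), gold_arcs)
    return len(before) - len(after)
```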

SLIDE 14

Today's topics:

  • Addressing compounding error
    • Improving on gold parse oracle
    • Research highlight: [Goldberg & Nivre, 2012]
  • Imitation learning for structured prediction
    • CIML ch 18
SLIDE 15

Imitation Learning aka learning by demonstration

  • Sequential decision-making problem
    • At each time step t:
      • Receive input information x_t
      • Take action a_t
      • Suffer loss l_t
    • Move to the next time step, until time T
  • Goal
    • Learn a policy function f(x_t) = a_t
    • that minimizes expected total loss over all trajectories enabled by f (rollout sketch below)
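
In code form, the loop might look like this sketch, where env stands in for an assumed environment interface:

```python
# The sequential decision loop above: roll out a policy f for T steps
# along its own trajectory and accumulate the losses it suffers.

def rollout(f, env, T):
    total_loss = 0.0
    x = env.reset()               # initial input information x_1
    for t in range(T):
        a = f(x)                  # policy: f(x_t) = a_t
        x, loss = env.step(a)     # suffer loss l_t, move to next step
        total_loss += loss
    return total_loss
```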
SLIDE 16

Supervised Imitation Learning

SLIDE 17

Supervised Imitation Learning

Problem with the supervised approach: compounding error

SLIDE 18

How can we train the system to make better predictions off the expert path?

  • We want a policy f that leads to good performance in the configurations that f itself encounters
  • A chicken-and-egg problem
  • Can be addressed by an iterative approach
SLIDE 19

DAGGER: simple & effective imitation learning via Data AGGregation

Requires interaction with expert!
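
A sketch of the loop (after Ross, Gordon & Bagnell, 2011; train_classifier and env are assumed interfaces):

```python
# DAGGER sketch: collect states along the *learned* policy's own
# trajectories, label them with the expert's action, aggregate, retrain.
# This is why interaction with the expert is required: it must be
# queried on states the learned policy visits, not just on its own path.

def dagger(expert, train_classifier, env, n_iterations, T):
    data = []
    policy = expert                        # iteration 0 follows the expert
    for _ in range(n_iterations):
        x = env.reset()
        for _ in range(T):
            data.append((x, expert(x)))    # expert labels the visited state
            x, _ = env.step(policy(x))     # but the current policy drives
        policy = train_classifier(data)    # retrain on aggregated dataset
    return policy
```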

SLIDE 20

When is DAGGER used in practice?

  • Interaction with the expert is not always possible
  • Classic use case
    • Expert = slow algorithm
    • Use DAGGER to learn a faster algorithm that imitates the expert
    • Example: game playing, where expert = brute-force search in simulation mode
  • But also structured prediction
SLIDE 21

Sequence labeling via imitation learning

  • What is the "expert" here?
    • Given a loss function (e.g., Hamming loss)
    • The expert takes the action that minimizes long-term loss (sketch below)
    • When the expert can be computed exactly, it is called an oracle
  • Key advantages
    • Can define features over the output prefix at time t
    • No restriction to Markov features

(Slide annotations: the expert picks the action a that minimizes the loss of the best reachable output starting with prefix ŷ ∘ a, where ŷ is the output prefix at time t.)
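
For Hamming loss, the expert is trivial to compute exactly, which is why it counts as an oracle here; a minimal sketch:

```python
# Hamming loss decomposes per position, so the action minimizing the
# loss of the best output reachable from prefix y_hat ∘ a is simply
# the gold label at the current position: past mistakes are sunk cost.

def hamming_expert(gold_labels, t, prefix):
    # `prefix` (the output produced so far) doesn't change the optimal choice
    return gold_labels[t]
```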

SLIDE 22

Today's topics:

  • Addressing compounding error
    • Improving on gold parse oracle
    • Research highlight: [Goldberg & Nivre, 2012]
  • Imitation learning for structured prediction
    • CIML ch 18