SLIDE 1 School of Data Science, Fudan University
DATA130006 Text Management and Analysis
Dependency Parsing
Zhongyu Wei (魏忠钰)
December 6th, 2017
Adapted from Stanford CS124U
SLIDE 2
Outline
§ Introduction
SLIDE 3 Dependency Grammar and Dependency Structure
Dependency syntax postulates that syntactic structure consists of lexical items linked by binary asymmetric relations (“arrows”) called dependencies. The arrows are commonly typed with the name of a grammatical relation (subject, prepositional object, apposition, etc.).
[Figure: dependency parse of “Bills on ports and immigration were submitted by Senator Brownback, Republican of Kansas”, with labeled arcs (nsubjpass, auxpass, prep, pobj, conj, cc, nn, appos).]
SLIDE 4 Dependency Grammar and Dependency Structure
Dependency syntax postulates that syntactic structure consists of lexical items linked by binary asymmetric relations (“arrows”) called dependencies.
[Figure: the same dependency parse as on the previous slide.]
The arrow connects a head (governor, superior, regent) with a dependent (modifier, inferior, subordinate). Usually, dependencies form a tree (connected, acyclic, single-head).
SLIDE 5
Relation between phrase structure and dependency structure
§ A dependency grammar has a notion of a head. Officially, CFGs don’t.
§ But modern linguistic theory and all modern statistical parsers (Charniak, Collins, Stanford, …) do, via hand-written phrasal “head rules”:
§ The head of a Noun Phrase is a noun/number/adjective/…
§ The head of a Verb Phrase is a verb/modal/…
§ The head rules can be used to extract a dependency parse from a CFG parse
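To make the head-rule idea concrete, here is a minimal sketch of extracting dependencies from a constituency tree. The tree encoding `(label, children)` with leaves as `(POS, word)`, and the `HEAD_RULES` entries, are illustrative assumptions, not the actual Collins or Stanford head tables.

```python
# Illustrative head rules: for each phrase label, an ordered priority
# list of child labels; the first match becomes the head child.
HEAD_RULES = {
    "S":  ["VP", "NP"],
    "VP": ["VBD", "VBZ", "VB", "MD", "VP"],
    "NP": ["NN", "NNS", "NNP", "CD", "JJ", "NP"],
}

def find_head(label, children):
    """Pick the head child via the priority list for this phrase label."""
    for cand in HEAD_RULES.get(label, []):
        for child in children:
            if child[0] == cand:
                return child
    return children[-1]          # fallback: rightmost child

def extract_deps(tree, deps):
    """Return the lexical head word of `tree`; record (head, dependent) arcs."""
    label, rest = tree
    if isinstance(rest, str):    # leaf: (POS, word)
        return rest
    head_child = find_head(label, rest)
    head_word = extract_deps(head_child, deps)
    for child in rest:
        if child is not head_child:
            deps.append((head_word, extract_deps(child, deps)))
    return head_word

# Toy parse of "children liked toys"
tree = ("S", [("NP", [("NNS", "children")]),
              ("VP", [("VBD", "liked"), ("NP", [("NNS", "toys")])])])
deps = []
root = extract_deps(tree, deps)
```

Percolating heads bottom-up like this turns every non-head child into a dependent of the phrase’s head word, which is exactly how dependency versions of constituency treebanks are produced.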
SLIDE 6
Methods of Dependency Parsing
§ Dynamic programming (like in the CKY algorithm)
§ You can do it similarly to lexicalized PCFG parsing: an O(n5) algorithm
§ Eisner (1996) gives a clever algorithm that reduces the complexity to O(n3), by producing parse items with heads at the ends rather than in the middle
§ Graph algorithms
§ You create a Maximum Spanning Tree for a sentence
§ McDonald et al.’s (2005) MSTParser scores dependencies independently using an ML classifier (it uses MIRA, for online learning, but it could be MaxEnt)
§ Constraint Satisfaction
§ Edges are eliminated that don’t satisfy hard constraints. Karlsson (1990), etc.
§ “Deterministic parsing”
§ Greedy choice of attachments guided by machine learning classifiers
§ MaltParser (Nivre et al. 2008)
SLIDE 7 Dependency Conditioning Preferences
What are the sources of information for dependency parsing?
- 1. Bilexical affinities: [issues → the] is plausible
- 2. Dependency distance: mostly with nearby words
- 3. Intervening material: dependencies rarely span intervening verbs or punctuation
- 4. Valency of heads: how many dependents on which side are usual for a head?
ROOT Discussion of the outstanding issues was completed .
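The “dependency distance” preference can be illustrated on the example sentence. The gold attachments below are my assumption for illustration; the slide does not give them.

```python
# "ROOT Discussion of the outstanding issues was completed ."
# Index 0 is ROOT; heads[i] is the (assumed) head index of token i.
words = ["ROOT", "Discussion", "of", "the", "outstanding",
         "issues", "was", "completed", "."]
heads = {1: 7, 2: 1, 3: 5, 4: 5, 5: 2, 6: 7, 7: 0, 8: 7}

# Dependency distance: most arcs connect nearby words; long arcs are rare.
distances = {i: abs(i - h) for i, h in heads.items() if h != 0}
# e.g. the -> issues has distance 2; Discussion -> completed is the long arc
```

Most arcs here have distance 1–3; only the subject arc spans six words, which is typical of the distance distribution a parser can exploit.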
SLIDE 8
Outline
§ Introduction
§ Greedy Transition-Based Parsing
SLIDE 9
MaltParser [Nivre et al. 2008]
§ A simple form of greedy discriminative dependency parser
§ The parser does a sequence of bottom-up actions
§ Roughly like “shift” or “reduce” in a shift-reduce parser, but the “reduce” actions are specialized to create dependencies with head on left or right
§ The parser has:
§ a stack σ, written with top to the right
§ which starts with the ROOT symbol
§ a buffer β, written with top to the left
§ which starts with the input sentence
§ a set of dependency arcs A
§ which starts off empty
§ a set of actions
SLIDE 10 Basic transition-based dependency parser
Start: σ = [ROOT], β = w1, …, wn, A = ∅
- 1. Shift: σ, wi|β, A ⇒ σ|wi, β, A
- 2. Left-Arc_r: σ|wi, wj|β, A ⇒ σ, wj|β, A ∪ {r(wj, wi)}
- 3. Right-Arc_r: σ|wi, wj|β, A ⇒ σ, wi|β, A ∪ {r(wi, wj)}
Finish: β = ∅
SLIDE 11 Actions (“arc-eager” dependency parser)
Start: σ = [ROOT], β = w1, …, wn, A = ∅
- 1. Left-Arc_r: σ|wi, wj|β, A ⇒ σ, wj|β, A ∪ {r(wj, wi)}
  Precondition: r′(wk, wi) ∉ A, wi ≠ ROOT
- 2. Right-Arc_r: σ|wi, wj|β, A ⇒ σ|wi|wj, β, A ∪ {r(wi, wj)}
- 3. Reduce: σ|wi, β, A ⇒ σ, β, A
  Precondition: r′(wk, wi) ∈ A
- 4. Shift: σ, wi|β, A ⇒ σ|wi, β, A
Finish: β = ∅
This is the common “arc-eager” variant: a head can immediately take a right dependent, before all of its own dependents have been found.
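The four arc-eager transitions can be coded almost verbatim. This is a minimal sketch (not MaltParser itself); it replays the action sequence of the “Happy children” example on the following slides.

```python
# Configuration: stack (top = end of list), buffer (front = start of
# list), and a set of (relation, head, dependent) arcs.

def has_head(w, arcs):
    return any(dep == w for _, _, dep in arcs)

def step(action, rel, stack, buf, arcs):
    if action == "LA":                    # Left-Arc_r: pop wi, add r(wj, wi)
        wi, wj = stack[-1], buf[0]
        assert wi != "ROOT" and not has_head(wi, arcs)   # preconditions
        stack.pop()
        arcs.add((rel, wj, wi))
    elif action == "RA":                  # Right-Arc_r: add r(wi, wj), push wj
        wi, wj = stack[-1], buf.pop(0)
        arcs.add((rel, wi, wj))
        stack.append(wj)
    elif action == "Reduce":              # pop wi (it must already have a head)
        assert has_head(stack[-1], arcs)
        stack.pop()
    elif action == "Shift":               # move buffer front onto the stack
        stack.append(buf.pop(0))

words = "Happy children like to play with their friends .".split()
stack, buf, arcs = ["ROOT"], list(words), set()
actions = [("Shift", None), ("LA", "amod"), ("Shift", None), ("LA", "nsubj"),
           ("RA", "root"), ("Shift", None), ("LA", "aux"), ("RA", "xcomp"),
           ("RA", "prep"), ("Shift", None), ("LA", "poss"), ("RA", "pobj"),
           ("Reduce", None), ("Reduce", None), ("Reduce", None),
           ("RA", "punc")]
for action, rel in actions:
    step(action, rel, stack, buf, arcs)
# Terminates with buf == [] and arcs == A9 from the worked example
```

Running it yields the nine arcs of A9, with parsing finished as soon as the buffer is empty, exactly as in the trace.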
SLIDE 12 Example
- 1. Left-Arc_r: σ|wi, wj|β, A ⇒ σ, wj|β, A ∪ {r(wj, wi)}   Precondition: (wk, r′, wi) ∉ A, wi ≠ ROOT
- 2. Right-Arc_r: σ|wi, wj|β, A ⇒ σ|wi|wj, β, A ∪ {r(wi, wj)}
- 3. Reduce: σ|wi, β, A ⇒ σ, β, A   Precondition: (wk, r′, wi) ∈ A
- 4. Shift: σ, wi|β, A ⇒ σ|wi, β, A
Happy children like to play with their friends .
          Stack                          Buffer                Arcs
          [ROOT]                         [Happy, children, …]  ∅
Shift     [ROOT, Happy]                  [children, like, …]   ∅
LA_amod   [ROOT]                         [children, like, …]   {amod(children, Happy)} = A1
Shift     [ROOT, children]               [like, to, …]         A1
LA_nsubj  [ROOT]                         [like, to, …]         A1 ∪ {nsubj(like, children)} = A2
RA_root   [ROOT, like]                   [to, play, …]         A2 ∪ {root(ROOT, like)} = A3
Shift     [ROOT, like, to]               [play, with, …]       A3
LA_aux    [ROOT, like]                   [play, with, …]       A3 ∪ {aux(play, to)} = A4
RA_xcomp  [ROOT, like, play]             [with, their, …]      A4 ∪ {xcomp(like, play)} = A5
SLIDE 13 Example
Happy children like to play with their friends .
RA_xcomp  [ROOT, like, play]                [with, their, …]   A4 ∪ {xcomp(like, play)} = A5
RA_prep   [ROOT, like, play, with]          [their, friends, …] A5 ∪ {prep(play, with)} = A6
Shift     [ROOT, like, play, with, their]   [friends, .]       A6
LA_poss   [ROOT, like, play, with]          [friends, .]       A6 ∪ {poss(friends, their)} = A7
RA_pobj   [ROOT, like, play, with, friends] [.]                A7 ∪ {pobj(with, friends)} = A8
Reduce    [ROOT, like, play, with]          [.]                A8
Reduce    [ROOT, like, play]                [.]                A8
Reduce    [ROOT, like]                      [.]                A8
RA_punc   [ROOT, like, .]                   []                 A8 ∪ {punc(like, .)} = A9
You terminate as soon as the buffer is empty. Dependencies = A9
SLIDE 14
MaltParser [Nivre et al. 2008]
§ We still have to explain how the next action is chosen
§ Each action is predicted by a discriminative classifier (often an SVM; could be a MaxEnt classifier) over each legal move
§ Max of 4 untyped choices; max of |R| × 2 + 2 when typed
§ Features: top-of-stack word and POS; first-in-buffer word and POS; etc.
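The feature templates below sketch the kind of indicator features a MaltParser-style classifier consumes; the exact template set is an assumption (real systems use larger, tuned sets).

```python
def features(stack, buf, pos):
    """Indicator features from the parser configuration.
    `pos` maps each word to its POS tag (assumed to be given)."""
    f = []
    if stack:                              # top-of-stack word and POS
        f += [f"s0.w={stack[-1]}", f"s0.p={pos.get(stack[-1], 'ROOT')}"]
    if buf:                                # first-in-buffer word and POS
        f += [f"b0.w={buf[0]}", f"b0.p={pos[buf[0]]}"]
    if stack and buf:                      # word-pair conjunction feature
        f.append(f"s0.w|b0.w={stack[-1]}|{buf[0]}")
    return f

pos = {"children": "NNS", "like": "VBP"}
feats = features(["ROOT", "children"], ["like", "to"], pos)
```

Each feature string becomes one dimension of a sparse vector; the classifier scores every legal transition against such vectors and the parser greedily takes the best one.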
§ There is NO search (in the simplest and usual form)
§ But you could do some kind of beam search if you wish
§ The model’s accuracy is slightly below the best LPCFGs (evaluated on dependencies), but
§ It provides close to state-of-the-art parsing performance
§ It provides VERY fast linear-time parsing
SLIDE 15
Evaluation of Dependency Parsing: (labeled) dependency accuracy
ROOT She saw the video lecture
0 1 2 3 4 5
(Columns: token index, head index, word, relation)

Gold:               Parsed:
1 2 She nsubj       1 2 She nsubj
2 0 saw root        2 0 saw root
3 5 the det         3 4 the det
4 5 video nn        4 5 video nsubj
5 2 lecture dobj    5 2 lecture ccomp

Acc = # correct deps / # of deps
UAS = 4 / 5 = 80% LAS = 2 / 5 = 40%
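The UAS/LAS computation can be reproduced in a few lines; the gold and parsed analyses are taken from the slide (each token maps to a (head index, label) pair, with head 0 = ROOT).

```python
gold   = {1: (2, "nsubj"), 2: (0, "root"), 3: (5, "det"),
          4: (5, "nn"),    5: (2, "dobj")}
parsed = {1: (2, "nsubj"), 2: (0, "root"), 3: (4, "det"),
          4: (5, "nsubj"), 5: (2, "ccomp")}

# UAS: head must match; LAS: head and label must both match.
uas = sum(gold[i][0] == parsed[i][0] for i in gold) / len(gold)
las = sum(gold[i] == parsed[i] for i in gold) / len(gold)
# uas == 0.8, las == 0.4, matching the slide
```

Note that LAS ≤ UAS by construction: every labeled match is also an unlabeled match.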
SLIDE 16 Representative performance numbers
§ The CoNLL-X (2006) shared task provides evaluation numbers for various dependency parsing approaches
§ MALT: LAS scores from 65–92%, depending greatly on language/treebank
§ Here we give a few UAS numbers for English to allow some comparison to constituency parsing
Parser                                                     UAS%
Sagae and Lavie (2006) – ensemble of dependency parsers    92.7
Charniak (2000) – generative, constituency                 92.2
Collins (1999) – generative, constituency                  91.7
McDonald and Pereira (2005) – MST graph-based dependency   91.5
Yamada and Matsumoto (2003) – transition-based dependency  90.4
SLIDE 17
Projectivity
§ Dependencies from a CFG tree using heads must be projective
§ There must not be any crossing dependency arcs when the words are laid out in their linear order, with all arcs above the words.
§ But dependency theory normally does allow non-projective structures to account for displaced constituents
§ You can’t easily get the semantics of certain constructions right without these non-projective dependencies
Who did Bill buy the coffee from yesterday ?
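Projectivity is easy to test mechanically: a dependency tree (with ROOT at position 0 and arcs drawn above the words) is projective iff no two arcs cross. The head indices for the example sentence below are an illustrative assumption, not given on the slide.

```python
def is_projective(heads):
    """heads[i] = head index of token i (1-based; head 0 = ROOT).
    A tree is projective iff no two arcs cross."""
    arcs = [(min(d, h), max(d, h)) for d, h in heads.items()]
    for l1, r1 in arcs:
        for l2, r2 in arcs:
            if l1 < l2 < r1 < r2:          # arcs overlap without nesting
                return False
    return True

# "Who did Bill buy the coffee from yesterday ?" (assumed heads):
# Who(1) attaches to from(7), which crosses arcs headed by buy(4).
heads_nonproj = {1: 7, 2: 4, 3: 4, 4: 0, 5: 6, 6: 4, 7: 4, 8: 4, 9: 4}
# "She saw it" — a projective tree for comparison.
heads_proj = {1: 2, 2: 0, 3: 2}
```

The quadratic pairwise check suffices for a sketch; a linear-time check over the arcs sorted by span is also possible.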
SLIDE 18 Handling non-projectivity
- The arc-eager algorithm we presented only builds projective dependency trees
- Possible directions to head:
- 1. Just declare defeat on non-projective arcs
- 2. Use a dependency formalism which only admits projective representations (a CFG doesn’t represent such structures…)
- 3. Use a postprocessor on the output of a projective dependency parsing algorithm to identify and resolve non-projective links
- 4. Add extra types of transitions that can model at least most non-projective structures
- 5. Move to a parsing mechanism that does not use or require any constraints on projectivity (e.g., the graph-based MSTParser)
SLIDE 19
Outline
§ Introduction
§ Greedy Transition-Based Parsing
§ Relation Extraction with Stanford Dependencies
SLIDE 20 Dependency paths identify relations like protein interaction
[Erkan et al. EMNLP 07, Fundel et al. 2007]
KaiC ←nsubj– interacts –prep_with→ SasA
KaiC ←nsubj– interacts –prep_with→ SasA –conj_and→ KaiA
KaiC ←nsubj– interacts –prep_with→ SasA –conj_and→ KaiB
[Figure: dependency parse of “The results demonstrated that KaiC interacts rhythmically with SasA, KaiA, and KaiB”, with arcs det(results, The), nsubj(demonstrated, results), ccomp(demonstrated, interacts), compl(interacts, that), nsubj(interacts, KaiC), advmod(interacts, rhythmically), prep_with(interacts, SasA), conj_and(SasA, KaiA), conj_and(SasA, KaiB).]
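Paths like the ones above can be extracted by breadth-first search over the dependency graph treated as undirected, recording arc labels with their direction. This is a sketch; the arc list transcribes the parse shown on the slide.

```python
from collections import deque

arcs = [("nsubj", "demonstrated", "results"),
        ("det", "results", "The"),
        ("ccomp", "demonstrated", "interacts"),
        ("compl", "interacts", "that"),
        ("nsubj", "interacts", "KaiC"),
        ("advmod", "interacts", "rhythmically"),
        ("prep_with", "interacts", "SasA"),
        ("conj_and", "SasA", "KaiA"),
        ("conj_and", "SasA", "KaiB")]

def dep_path(arcs, src, dst):
    """Shortest label path from src to dst over the undirected dep graph."""
    nbrs = {}
    for rel, head, dep in arcs:            # index arcs in both directions
        nbrs.setdefault(head, []).append((f"-{rel}->", dep))
        nbrs.setdefault(dep, []).append((f"<-{rel}-", head))
    queue, seen = deque([(src, [])]), {src}
    while queue:
        node, path = queue.popleft()
        if node == dst:
            return path
        for label, nxt in nbrs.get(node, []):
            if nxt not in seen:
                seen.add(nxt)
                queue.append((nxt, path + [label]))
    return None

path = dep_path(arcs, "KaiC", "KaiA")
# KaiC <-nsubj- interacts -prep_with-> SasA -conj_and-> KaiA
```

Because BFS returns shortest paths, the extracted path is exactly the compact “KaiC ←nsubj– interacts –prep_with→ SasA –conj_and→ KaiA” pattern used as a relation-extraction feature.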