Natural Language Processing (CSEP 517): Dependency Syntax and - PowerPoint PPT Presentation

Natural Language Processing (CSEP 517): Dependency Syntax and Parsing Noah Smith � 2017 c University of Washington nasmith@cs.washington.edu May 1, 2017 1 / 96

To-Do List ◮ Online quiz: due Sunday ◮ Read: K¨ ubler et al. (2009, ch. 1, 2, 6) ◮ A3 due May 7 (Sunday) ◮ A4 due May 14 (Sunday) 2 / 96

Dependencies Informally, you can think of dependency structures as a transformation of phrase-structures that ◮ maintains the word-to-word relationships induced by lexicalization, ◮ adds labels to them, and ◮ eliminates the phrase categories. There are also linguistic theories built on dependencies (Tesni` ere, 1959; Mel’ˇ cuk, 1987), as well as treebanks corresponding to those. ◮ Free(r)-word order languages (e.g., Czech) 3 / 96

Dependency Tree: Definition Let x = � x 1 , . . . , x n � be a sentence. Add a special root symbol as “ x 0 .” A dependency tree consists of a set of tuples � p, c, ℓ � , where ◮ p ∈ { 0 , . . . , n } is the index of a parent ◮ c ∈ { 1 , . . . , n } is the index of a child ◮ ℓ ∈ L is a label Different annotation schemes define different label sets L , and different constraints on the set of tuples. Most commonly: ◮ The tuple is represented as a directed edge from x p to x c with label ℓ . ◮ The directed edges form an arborescence (directed tree) with x 0 as the root (sometimes denoted root ). 4 / 96

Example S NP VP Pronoun Verb NP we wash Determiner Noun our cats Phrase-structure tree. 5 / 96

Example S NP VP Pronoun NP Verb we wash Determiner Noun our cats Phrase-structure tree with heads. 6 / 96

Example S wash NP we VP wash Pronoun we NP cats Verb wash we wash Determiner our Noun cats our cats Phrase-structure tree with heads, lexicalized. 7 / 96

Example we wash our cats “Bare bones” dependency tree. 8 / 96

Example we wash our cats who stink 9 / 96

Example we vigorously wash our cats who stink 10 / 96

Content Heads vs. Function Heads Credit: Nathan Schneider little kids were always watching birds with fish little kids were always watching birds with fish 11 / 96

Labels root pobj sbj dobj prep kids saw birds with fish Key dependency relations captured in the labels include: subject, direct object, preposition object, adjectival modifier, adverbial modifier. In this lecture, I will mostly not discuss labels, to keep the algorithms simpler. 12 / 96

Coordination Structures we vigorously wash our cats and dogs who stink The bugbear of dependency syntax. 13 / 96

Example we vigorously wash our cats and dogs who stink Make the first conjunct the head? 14 / 96

Example we vigorously wash our cats and dogs who stink Make the coordinating conjunction the head? 15 / 96

Example we vigorously wash our cats and dogs who stink Make the second conjunct the head? 16 / 96

Dependency Schemes ◮ Transform the treebank: define “head rules” that can select the head child of any node in a phrase-structure tree and label the dependencies. ◮ More powerful, less local rule sets, possibly collapsing some words into arc labels. ◮ Stanford dependencies are a popular example (de Marneffe et al., 2006). ◮ Direct annotation. 17 / 96

Three Approaches to Dependency Parsing 1. Dynamic programming with the Eisner algorithm. 2. Transition-based parsing with a stack. 3. Chu-Liu-Edmonds algorithm for arborescences. 18 / 96

Dependencies and Grammar Context-free grammars can be used to encode dependency structures. For every head word and constellation of dependent children: N head → N leftmost-sibling . . . N head . . . N rightmost-sibling And for every v ∈ V : N v → v and S → N v . 19 / 96

Dependencies and Grammar Context-free grammars can be used to encode dependency structures. For every head word and constellation of dependent children: N head → N leftmost-sibling . . . N head . . . N rightmost-sibling And for every v ∈ V : N v → v and S → N v . A bilexical dependency grammar binarizes the dependents, generating only one per rule. 20 / 96

Dependencies and Grammar Context-free grammars can be used to encode dependency structures. For every head word and constellation of dependent children: N head → N leftmost-sibling . . . N head . . . N rightmost-sibling And for every v ∈ V : N v → v and S → N v . A bilexical dependency grammar binarizes the dependents, generating only one per rule. Such a grammar can produce only projective trees, which are (informally) trees in which the arcs don’t cross. 21 / 96

Bilexical Dependency Grammar: Derivation S N wash N we N wash we N wash N cats N our N cats wash our cats ıvely, the CKY algorithm will require O ( n 5 ) runtime. Why? Na¨ 22 / 96

CKY for Bilexical Context-Free Grammars N x h N x c N x c N x h i j j + 1 k i j j + 1 k p ( N x h N x c | N x h ) p ( N x c N x h | N x h ) N x h N x h i k i k 23 / 96

CKY Example goal S Nwash Nwash Ncats Nwe Nwash Nour. Ncats we wash our cats 24 / 96

Dependency Parsing with the Eisner Algorithm (Eisner, 1996) h d d h c h h c Items: ◮ Both triangles indicate that x d is a descendant of x h . ◮ Both trapezoids indicate that x c can be attached as the child of x h . ◮ In all cases, the words “in between” are descendants of x h . 25 / 96

Dependency Parsing with the Eisner Algorithm (Eisner, 1996) Initialization: p ( x i | N x i ) 1 i i i i Goal: i n 1 i p ( N x i | S ) goal 26 / 96

Dependency Parsing with the Eisner Algorithm (Eisner, 1996) Attaching a left dependent: Complete a left child: i j j + 1 k i j j k p ( N x i N x k | N x k ) i k i k 27 / 96

Dependency Parsing with the Eisner Algorithm (Eisner, 1996) Attaching a right dependent: Complete a right child: i j j + 1 k i j j k p ( N x i N x k | N x i ) i k i k 28 / 96

Eisner Algorithm Example goal we wash our cats 29 / 96

Three Approaches to Dependency Parsing 1. Dynamic programming with the Eisner algorithm. 2. Transition-based parsing with a stack. 3. Chu-Liu-Edmonds algorithm for arborescences. 30 / 96

Transition-Based Parsing ◮ Process x once, from left to right, making a sequence of greedy parsing decisions. 31 / 96

Transition-Based Parsing ◮ Process x once, from left to right, making a sequence of greedy parsing decisions. ◮ Formally, the parser is a state machine ( not a finite-state machine) whose state is represented by a stack S and a buffer B . 32 / 96

Transition-Based Parsing ◮ Process x once, from left to right, making a sequence of greedy parsing decisions. ◮ Formally, the parser is a state machine ( not a finite-state machine) whose state is represented by a stack S and a buffer B . ◮ Initialize the buffer to contain x and the stack to contain the root symbol. 33 / 96

Transition-Based Parsing ◮ Process x once, from left to right, making a sequence of greedy parsing decisions. ◮ Formally, the parser is a state machine ( not a finite-state machine) whose state is represented by a stack S and a buffer B . ◮ Initialize the buffer to contain x and the stack to contain the root symbol. ◮ The “arc standard” transition set (Nivre, 2004): ◮ shift the word at the front of the buffer B onto the stack S . ◮ right-arc : u = pop ( S ) ; v = pop ( S ) ; push ( S, v → u ) . ◮ left-arc : u = pop ( S ) ; v = pop ( S ) ; push ( S, v ← u ) . (For labeled parsing, add labels to the right-arc and left-arc transitions.) 34 / 96

Transition-Based Parsing ◮ Process x once, from left to right, making a sequence of greedy parsing decisions. ◮ Formally, the parser is a state machine ( not a finite-state machine) whose state is represented by a stack S and a buffer B . ◮ Initialize the buffer to contain x and the stack to contain the root symbol. ◮ The “arc standard” transition set (Nivre, 2004): ◮ shift the word at the front of the buffer B onto the stack S . ◮ right-arc : u = pop ( S ) ; v = pop ( S ) ; push ( S, v → u ) . ◮ left-arc : u = pop ( S ) ; v = pop ( S ) ; push ( S, v ← u ) . (For labeled parsing, add labels to the right-arc and left-arc transitions.) ◮ During parsing, apply a classifier to decide which transition to take next, greedily. No backtracking. 35 / 96

Transition-Based Parsing: Example Buffer B : we vigorously Stack S : wash our root cats who stink Actions: 36 / 96

Transition-Based Parsing: Example Buffer B : vigorously Stack S : wash our we cats root who stink Actions: shift 37 / 96

Transition-Based Parsing: Example Buffer B : Stack S : wash our vigorously cats we who root stink Actions: shift shift 38 / 96

Transition-Based Parsing: Example Buffer B : Stack S : our wash cats vigorously who we stink root Actions: shift shift shift 39 / 96

Transition-Based Parsing: Example Stack S : Buffer B : our cats vigorously wash who stink we root Actions: shift shift shift left-arc 40 / 96

Transition-Based Parsing: Example Stack S : Buffer B : our cats who we vigorously wash stink root Actions: shift shift shift left-arc left-arc 41 / 96

Natural Language Processing (CSEP 517): Dependency Syntax and - PowerPoint PPT Presentation

Natural Language Processing (CSEP 517): Dependency Syntax and Parsing Noah Smith 2017 c University of Washington nasmith@cs.washington.edu May 1, 2017 1 / 96 To-Do List Online quiz: due Sunday Read: K ubler et al. (2009, ch.

Natural Language Processing (CSEP 517): Phrase Structure Syntax and Parsing Noah Smith 2017

CSEP 517 Natural Language Processing Language Models Luke Zettlemoyer Slides adapted from Dan

CSEP 517 Natural Language Processing Introduction Luke Zettlemoyer Slides adapted from Dan

CSEP 517: Natural Language Processing New PMP Course! Instructor: Luke Zettlemoyer Autumn 2013

CSEP 517 Natural Language Processing Autumn 2018 Introduction Luke Zettlemoyer Slides adapted

Natural Language Processing (CSEP 517): Computational Pragmatics Chenhao Tan 2017 c

Natural Language Processing (CSEP 517): Introduction & Language Models Noah Smith c 2017

Natural Language Processing (CSE 517): Dependency Syntax and Parsing Noah A. Smith Swabha

CSE 517 Natural Language Processing Winter 2017 Dependency Parsing And Other Grammar Formalisms

CSEP 517 Natural Language Processing Autumn 2015 Parsing (Trees) Yejin Choi - University of

CSEP 517 Natural Language Processing Frame Semantics Luke Zettlemoyer Slides adapted from Yejin

CSEP 517: Natural Language Processing Recurrent Neural Networks Autumn 2018 Luke Zettlemoyer

CSEP 517 Natural Language Processing Autumn 2015 Introduction Yejin Choi Slides adapted

CSEP 517 Natural Language Processing Luke Zettlemoyer Machine Translation, Sequence-to-sequence

Natural Language Processing (CSEP 517): Distributional Semantics Roy Schwartz 2017 c

Natural Language Processing (CSEP 517): Machine Translation (Continued), Summarization, &

Integrating Heterogeneous and Distributed Information about Marine Species through a Top Level

Outline Scalinga Plenitude of Power Laws Scaling-at-large Scaling-at-large Principles of

Changing Arctic Ocean Implications for marine biology and biogeochemistry Photo by Jen Freer Dr

Spatial-temporal modelling of delta smelt in the San Francisco Estuary Ken Newman US Fish and

DR. KAI TROLL BEST BUDDIES EUROPE, MIDDLE EAST, AFRICA REGION HISTORY EUNICE KENNEDY SHRIVER

Thoughts on Learner Data and Motivation Learner Language Dependency Parsing and Dependency

Moral Preferences F R A N C E S C A R O S S I Decision making Based on our preferences

RESRFP17-1 Proposers Webinar June 14, 2017 2 Agenda Background RESRFP17-1

Natural Language Processing (CSEP 517): Dependency Syntax and - PowerPoint PPT Presentation

Natural Language Processing (CSEP 517): Dependency Syntax and Parsing Noah Smith 2017 c University of Washington nasmith@cs.washington.edu May 1, 2017 1 / 96 To-Do List Online quiz: due Sunday Read: K ubler et al. (2009, ch.

Natural Language Processing (CSEP 517): Phrase Structure Syntax and Parsing Noah Smith 2017

CSEP 517 Natural Language Processing Language Models Luke Zettlemoyer Slides adapted from Dan

CSEP 517 Natural Language Processing Introduction Luke Zettlemoyer Slides adapted from Dan

CSEP 517: Natural Language Processing New PMP Course! Instructor: Luke Zettlemoyer Autumn 2013

CSEP 517 Natural Language Processing Autumn 2018 Introduction Luke Zettlemoyer Slides adapted

Natural Language Processing (CSEP 517): Computational Pragmatics Chenhao Tan 2017 c

Natural Language Processing (CSEP 517): Introduction &amp; Language Models Noah Smith c 2017

Natural Language Processing (CSE 517): Dependency Syntax and Parsing Noah A. Smith Swabha

CSE 517 Natural Language Processing Winter 2017 Dependency Parsing And Other Grammar Formalisms

CSEP 517 Natural Language Processing Autumn 2015 Parsing (Trees) Yejin Choi - University of

CSEP 517 Natural Language Processing Frame Semantics Luke Zettlemoyer Slides adapted from Yejin

CSEP 517: Natural Language Processing Recurrent Neural Networks Autumn 2018 Luke Zettlemoyer

CSEP 517 Natural Language Processing Autumn 2015 Introduction Yejin Choi Slides adapted

CSEP 517 Natural Language Processing Luke Zettlemoyer Machine Translation, Sequence-to-sequence

Natural Language Processing (CSEP 517): Distributional Semantics Roy Schwartz 2017 c

Natural Language Processing (CSEP 517): Machine Translation (Continued), Summarization, &amp;

Integrating Heterogeneous and Distributed Information about Marine Species through a Top Level

Outline Scalinga Plenitude of Power Laws Scaling-at-large Scaling-at-large Principles of

Changing Arctic Ocean Implications for marine biology and biogeochemistry Photo by Jen Freer Dr

Spatial-temporal modelling of delta smelt in the San Francisco Estuary Ken Newman US Fish and

DR. KAI TROLL BEST BUDDIES EUROPE, MIDDLE EAST, AFRICA REGION HISTORY EUNICE KENNEDY SHRIVER

Thoughts on Learner Data and Motivation Learner Language Dependency Parsing and Dependency

Moral Preferences F R A N C E S C A R O S S I Decision making Based on our preferences

RESRFP17-1 Proposers Webinar June 14, 2017 2 Agenda Background RESRFP17-1

Natural Language Processing (CSEP 517): Introduction & Language Models Noah Smith c 2017

Natural Language Processing (CSEP 517): Machine Translation (Continued), Summarization, &