
Dependency Grammars

Data structures and algorithms for Computational Linguistics III

Çağrı Çöltekin
ccoltekin@sfs.uni-tuebingen.de

University of Tübingen Seminar für Sprachwissenschaft

Winter Semester 2019–2020


So far …

(second part of the course)

  • Preliminaries: (formal) languages, grammars and automata
    – Chomsky hierarchy of language classes
    – Expressivity and computational complexity
    – Learnability
  • Finite state automata, regular languages, regular grammars and regular expressions
    – DFA, NFA, determinization
    – Closure properties of regular languages
    – Minimization

  • Finite state transducers and their applications in CL
  • Constituency parsing (CKY, Earley)

Ç. Çöltekin, SfS / University of Tübingen WS 19–20 1 / 27 Where were we? Constituency overview Dependency grammars Closing remarks

Next …

  • Dependency grammars, and dependency treebanks
  • Dependency parsing

    – Transition based dependency parsing (with a short introduction to classification)
    – Graph based dependency parsing


Why do we need syntactic parsing?

[S [NP John] [VP [V saw] [NP Mary]]]
[S [NP Mary] [VP [V saw] [NP John]]]

  • Syntactic analysis is an intermediate step in (semantic) interpretation of sentences
  • It is essential for understanding and generating natural language sentences (hence, also useful for applications like question answering, information extraction, …)
  • (Statistical) parsers are also used as language models for applications like speech recognition and machine translation
  • It can be used for grammar checking, and can be a useful tool for linguistic research


Ingredients of a parser

  • A grammar
  • An algorithm for parsing
  • A method for ambiguity resolution


Phrase structure (or constituency) grammars

The main idea is that a span of words forms a natural unit, called a constituent or phrase.

  • Constituency grammars are common in modern linguistics (also in computer science)
  • Most are based on a context-free ‘backbone’; extensions and restricted forms are common


An example: constituency grammar in action

Grammar
    S  → NP VP
    VP → V NP
    NP → John | Mary
    V  → saw

Parse tree
    [S [NP John] [VP [V saw] [NP Mary]]]

Derivations
    S ⇒ NP VP ⇒ John VP ⇒ John V NP ⇒ John saw NP ⇒ John saw Mary
    or, S ⇒* John saw Mary
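The derivation above can be replayed mechanically. The following is a minimal sketch, not part of the original slides: the dictionary encoding of the grammar and the helper name `leftmost_derivation` are assumptions made for illustration. It rewrites the leftmost nonterminal at every step, using a given sequence of rule choices.

```python
# Toy grammar from the slide: each nonterminal maps to its alternative right-hand sides.
GRAMMAR = {
    "S": [["NP", "VP"]],
    "VP": [["V", "NP"]],
    "NP": [["John"], ["Mary"]],
    "V": [["saw"]],
}


def leftmost_derivation(choices, start="S"):
    """Rewrite the leftmost nonterminal at every step.

    `choices` lists, in order, which alternative (0-based) to use at each
    rewriting step. Returns all sentential forms of the derivation.
    """
    form = [start]
    steps = [" ".join(form)]
    for choice in choices:
        # Find the leftmost symbol that is still a nonterminal.
        i = next(k for k, sym in enumerate(form) if sym in GRAMMAR)
        form = form[:i] + GRAMMAR[form[i]][choice] + form[i + 1:]
        steps.append(" ".join(form))
    return steps


# The last choice (1) picks the second NP alternative, "Mary".
derivation = leftmost_derivation([0, 0, 0, 0, 1])
print(" ⇒ ".join(derivation))
```

Changing the choice sequence to `[0, 1, 0, 0, 0]` derives "Mary saw John" from the same grammar.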


An exercise

  • Write down simple (phrase structure) grammar rules for parsing the sentence
      I read a good book during the break
    and construct the parse tree
  • Repeat the same for a (more-or-less direct) translation of the same sentence in another language
  • How about the following sentence?
      During the break, I read a good book


Where do grammars come from?

  • Grammars for (constituency) parsing can be either
    – hand crafted (many years of expert effort)
    – extracted from treebanks (which also require lots of effort)
    – ‘induced’ from raw data (interesting, but not as successful)
  • Current practice relies mostly on treebanks
  • Hybrid approaches also exist
  • Grammar induction is not common (for practical models), but exploiting unlabeled data for improving parsing is a common trend


Dependency grammars

introduction

  • Dependency grammars gained popularity in linguistics (particularly in CL) rather recently
  • They are old: roots can be traced back to Pāṇini (approx. 5th century BCE)
  • Modern dependency grammars are often attributed to Tesnière 1959
  • The main idea is capturing the relations between words, rather than grouping them into (abstract) constituents

[Figure: dependency tree for “John saw Mary” with arcs labelled subject, object, root]


Dependency grammars

  • No constituents: the units of syntactic structure are words
  • The structure of the sentence is represented by asymmetric, binary relations between syntactic units
  • Each relation defines one of the words as the head and the other as the dependent
  • Typically, the links (relations) have labels (dependency types)
  • Often an artificial root node is used for computational convenience


A more realistic example

[Figure: dependency tree for “I really enjoyed reading it .” with arcs labelled nsubj, advmod, root, xcomp, obj, punct]


Dependency grammars: alternative notation

[Figure: two notations for the same dependency analysis of “I saw her duck” (POS tags: pron verb pron noun), with arcs labelled root, subj, obj, nmod]


Dependency grammar: definition

A dependency grammar is a tuple (V, A)
  V is a set of nodes corresponding to the (syntactic) words (we implicitly assume that words have indexes)
  A is a set of arcs of the form (w_i, r, w_j) where
      w_i ∈ V is the head
      r is the type of the relation (arc label)
      w_j ∈ V is the dependent

This defines a directed graph.
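One possible in-memory encoding of (V, A), shown here for “John saw Mary”. This is a sketch under assumptions that are not in the slides: words are identified by their position, index 0 is reserved for an artificial root node, and the names `Arc` and `dependents` are made up for illustration.

```python
from typing import NamedTuple


class Arc(NamedTuple):
    head: int       # index of the head word w_i (0 is the artificial root)
    label: str      # relation type r
    dependent: int  # index of the dependent word w_j


# V: nodes indexed by position; index 0 is the artificial root node.
V = ["ROOT", "John", "saw", "Mary"]

# A: labelled arcs (head, label, dependent).
A = {
    Arc(0, "root", 2),
    Arc(2, "subject", 1),
    Arc(2, "object", 3),
}


def dependents(head, arcs):
    """Return (dependent index, label) pairs of `head`, in word order."""
    return sorted((a.dependent, a.label) for a in arcs if a.head == head)


for dep, label in dependents(2, A):
    print(f"{V[2]} --{label}--> {V[dep]}")
```

Storing indexes rather than word forms keeps repeated words apart, which is what the “words have indexes” remark in the definition is for.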


Dependency grammars: common assumptions

  • Every word has a single head
  • The dependency graphs are acyclic
  • The graph is connected
  • With these assumptions, the representation is a tree
  • Note that these assumptions are not universal but common for dependency parsing
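The three assumptions can be checked directly on a set of arcs. The sketch below is illustrative only: it assumes words are numbered 1..n, that node 0 is an artificial root, and that arcs are (head, label, dependent) triples; the function name `is_dependency_tree` is hypothetical.

```python
def is_dependency_tree(n_words, arcs):
    """Check single-headedness, acyclicity and connectedness.

    Words are numbered 1..n_words; node 0 is the artificial root.
    `arcs` is a collection of (head, label, dependent) triples.
    """
    heads = {}
    for head, _label, dep in arcs:
        if not (0 <= head <= n_words and 1 <= dep <= n_words):
            return False  # arc mentions an unknown node, or points into the root
        if dep in heads:
            return False  # the word already has a head
        heads[dep] = head
    if len(heads) != n_words:
        return False  # some word has no head at all
    # Follow head links upwards from every word. With exactly one head per
    # word, every path either reaches the root (node 0) or runs into a cycle.
    for word in range(1, n_words + 1):
        seen = set()
        node = word
        while node != 0:
            if node in seen:
                return False  # cycle: this part of the graph never reaches the root
            seen.add(node)
            node = heads[node]
    return True


tree = [(0, "root", 2), (2, "subject", 1), (2, "object", 3)]
print(is_dependency_tree(3, tree))
```

If every word has exactly one head and every chain of heads ends at the root, the graph is both connected and acyclic, so no separate connectivity test is needed.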


How to determine heads

  1. Head (H) determines the syntactic category of the construction (C) and can often replace C
  2. H determines the semantic category of C; the dependent (D) gives semantic specification
  3. H is obligatory, D may be optional
  4. H selects D and determines whether D is obligatory or optional
  5. The form and/or position of the dependent is determined by the head
  6. The form of D depends on H
  7. The linear position of D is specified with reference to H

(from Kübler, McDonald, and Nivre 2009, pp. 3–4)


Issues with head assignment and dependency labels

  • Determining heads is not always straightforward
  • A construction is called endocentric if the head can replace the whole construction, exocentric otherwise
      [Figure: “syntactic parsing” with an arc labelled amod; “saw Mary” with an arc labelled obj]
  • It is often unclear whether dependency labels encode syntactic or semantic functions


Some tricky constructions

Coordination

[Figure: three alternative analyses of “John and Mary work”; the first two use arcs labelled subj, cc, conj, the third uses subj, conj, conj]


Some tricky constructions

Adpositional phrases

[Figure: two alternative analyses of “…works from home”; one uses arcs labelled vcompl and pcompl, the other nmod and case]


Some tricky constructions

Subordinate clauses

[Figure: two alternative analyses of “think that they can…”; one uses arcs labelled obj, sbar, subj, the other obj, mark, subj]


Some tricky constructions

Auxiliaries vs. main verbs

[Figure: two alternative analyses of “…will work”, each with arcs labelled root and aux]


Dependency grammars: projectivity

[Figure: dependency tree for “A hearing is scheduled on the issue today .” (arc labels: ROOT, VC, PUNC, SBJ, NMOD, PP, TMP, NP, NMOD)]

  • If a dependency graph has no crossing edges, it is said to be projective, otherwise non-projective
  • Non-projectivity stems from long-distance dependencies and free word order
  • Projective dependency trees can be represented with context-free grammars
  • In general, projective dependency structures can be parsed more efficiently
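The “no crossing edges” condition translates into a pairwise test on arc spans. The sketch below is an illustration, not the course's code. It assumes a tree is given as a list of head indexes (one per word, 0 for the artificial root, which is placed to the left of the sentence). The head indexes used for the example sentence are a reconstruction of its usual analysis, since the figure's arcs did not survive extraction.

```python
from itertools import combinations


def is_projective(heads):
    """`heads[i]` is the head of word i+1; 0 denotes the artificial root,
    which is placed to the left of the first word."""
    # Represent every arc by its (left end, right end) positions.
    spans = [tuple(sorted((head, dep))) for dep, head in enumerate(heads, start=1)]
    for (a, b), (c, d) in combinations(spans, 2):
        # Two arcs cross if exactly one end of one lies strictly inside the other.
        if a < c < b < d or c < a < d < b:
            return False
    return True


# "John saw Mary": John <- saw, saw <- ROOT, Mary <- saw
print(is_projective([2, 0, 2]))

# "A hearing is scheduled on the issue today ." (assumed head indexes):
# the arc hearing -> on crosses the arc scheduled -> today.
print(is_projective([2, 3, 0, 3, 2, 7, 5, 4, 3]))
```

The pairwise comparison is quadratic in sentence length, which is acceptable for a check on single sentences.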


CoNLL-X/U format for dependency annotation

The single-head assumption allows a flat representation of dependency trees, with one word per line (columns shown: ID, FORM, LEMMA, UPOS, XPOS, FEATS, HEAD, DEPREL)

    1  Read   read   VERB   VB   Mood=Imp|VerbForm=Fin  0  root
    2  on     on     ADV    RB   _                      1  advmod
    3  to     to     PART   TO   _                      4  mark
    4  learn  learn  VERB   VB   VerbForm=Inf           1  xcomp
    5  the    the    DET    DT   Definite=Def           6  det
    6  facts  fact   NOUN   NNS  Number=Plur            4  obj
    7  .      .      PUNCT  .    _                      1  punct

[Figure: dependency tree for “Read on to learn the facts .” with arcs labelled advmod, mark, xcomp, det, obj, punct]

(example from the English Universal Dependencies treebank)
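Reading such a listing back into arcs takes only a few lines. This is a minimal sketch for the eight whitespace-separated columns shown on the slide. It is not a full CoNLL-U reader: real files are tab-separated, have two further columns (DEPS, MISC), and may contain multiword-token and empty-node lines, none of which are handled here. The names `FIELDS` and `read_tokens` are made up for the example.

```python
SAMPLE = """\
1 Read read VERB VB Mood=Imp|VerbForm=Fin 0 root
2 on on ADV RB _ 1 advmod
3 to to PART TO _ 4 mark
4 learn learn VERB VB VerbForm=Inf 1 xcomp
5 the the DET DT Definite=Def 6 det
6 facts fact NOUN NNS Number=Plur 4 obj
7 . . PUNCT . _ 1 punct
"""

# Column names for the eight columns shown on the slide.
FIELDS = ("id", "form", "lemma", "upos", "xpos", "feats", "head", "deprel")


def read_tokens(text):
    """Parse one sentence; blank lines and '#' comment lines are skipped."""
    tokens = []
    for line in text.splitlines():
        if not line.strip() or line.startswith("#"):
            continue
        token = dict(zip(FIELDS, line.split()))
        token["id"] = int(token["id"])
        token["head"] = int(token["head"])  # 0 means "attached to the root"
        tokens.append(token)
    return tokens


tokens = read_tokens(SAMPLE)
# One arc per line, because every word has exactly one head.
arcs = [(t["head"], t["deprel"], t["id"]) for t in tokens]
for head, label, dep in arcs:
    print(head, label, dep)
```

Each line yields exactly one (head, label, dependent) triple, which is why the single-head assumption is what makes the flat format possible.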

Dependency parsing

  • Dependency parsing has many similarities with context-free parsing (e.g., trees)
  • They also have some different properties (e.g., number of edges and depth of trees are limited)
  • Dependency parsing can be
    – grammar-driven (hand crafted rules or constraints)
    – data-driven (rules/model is learned from a treebank)
  • There are two main approaches:
    Graph-based: similar to context-free parsing, search for the best tree structure
    Transition-based: similar to shift-reduce parsing (used for programming language parsing), but using greedy search for the best transition sequence
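To give a feel for the transition-based idea, here is a sketch of one common shift-reduce style scheme, the arc-standard system. The slides do not say which transition system is meant, so this choice is an assumption. In a real parser a trained classifier picks each transition greedily. Here an oracle that looks at the gold heads stands in for the classifier, and the function name `arc_standard_oracle` is hypothetical.

```python
def arc_standard_oracle(heads):
    """Replay a projective gold tree as SHIFT / LEFT-ARC / RIGHT-ARC steps.

    `heads[i]` is the gold head of word i+1; 0 is the artificial root.
    Returns the transition sequence and the (head, dependent) arcs built.
    """
    n = len(heads)
    gold = {dep: head for dep, head in enumerate(heads, start=1)}
    # Number of dependents each node still has to collect.
    pending = {node: 0 for node in range(n + 1)}
    for head in gold.values():
        pending[head] += 1

    stack, buffer = [0], list(range(1, n + 1))
    arcs, transitions = [], []
    while buffer or len(stack) > 1:
        if len(stack) >= 2:
            below, top = stack[-2], stack[-1]
            if below != 0 and gold[below] == top:
                # LEFT-ARC: the top of the stack becomes head of the item below it.
                arcs.append((top, below))
                pending[top] -= 1
                del stack[-2]
                transitions.append("LEFT-ARC")
                continue
            if gold[top] == below and pending[top] == 0:
                # RIGHT-ARC: only once the top has collected all its dependents.
                arcs.append((below, top))
                pending[below] -= 1
                stack.pop()
                transitions.append("RIGHT-ARC")
                continue
        if not buffer:
            raise ValueError("no valid transition: the tree is not projective")
        stack.append(buffer.pop(0))  # SHIFT the next word onto the stack
        transitions.append("SHIFT")
    return transitions, arcs


transitions, arcs = arc_standard_oracle([2, 0, 2])  # "John saw Mary"
print(transitions)
print(arcs)
```

Every word is shifted once and removed once, so the sequence has exactly 2n transitions for n words. The `ValueError` branch also shows that this particular system can only build projective trees.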


Grammar-driven dependency parsing

  • Grammar-driven dependency parsers are typically based on
    – lexicalized CF parsing
    – parsing as a constraint satisfaction problem
        • start from a fully connected graph, eliminate trees that do not satisfy the constraints
        • the exact solution is intractable, so heuristics and approximate methods are often employed
        • sometimes ‘soft’, or weighted, constraints are used
    – Practical implementations exist
  • Our focus will be on data-driven methods


Dependency grammars

Advantages and disadvantages

  + Close relation to semantics
  + Easier for flexible/free word order
  + Lots, lots of (multi-lingual) computational work, resources
  + Often more useful in downstream tasks
  + More efficient parsing algorithms
  − No distinction between modification of the head and of the whole ‘constituent’
  − Some structures are difficult to capture, e.g., coordination


Summary

  • Dependency grammars are based on asymmetric, binary relations between syntactic units
  • Dependencies are (often) labeled
  • Dependency analyses are used more in downstream tasks

Next:

  • A hands-on introduction to Universal Dependencies
  • Dependency parsing
    – Transition based
    – Graph based


A familiar exercise

  • Construct a dependency tree for the sentence
      I read a good book during the break
  • Repeat the same for a (more-or-less direct) translation of the same sentence in another language
  • How about the following sentence?
      During the break, I read a good book


References / additional reading material

  • Kübler, McDonald, and Nivre (2009, Chapters 1 & 2)
  • The new version of Jurafsky and Martin (2009) also includes a draft chapter on dependency grammars and dependency parsing
  • The Universal Dependencies web site contains a wide range of information and examples. The tutorial slides at http://universaldependencies.org/eacl17tutorial/ are a good starting point.


References / additional reading material (cont.)

Jurafsky, Daniel and James H. Martin (2009). Speech and Language Processing: An Introduction to Natural Language Processing, Computational Linguistics, and Speech Recognition. Second edition. Pearson Prentice Hall. ISBN: 978-0-13-504196-3.

Kübler, Sandra, Ryan McDonald, and Joakim Nivre (2009). Dependency Parsing. Synthesis Lectures on Human Language Technologies. Morgan & Claypool. ISBN: 9781598295962.

Tesnière, Lucien (1959). Éléments de syntaxe structurale. Paris: Éditions Klincksieck.
