Deep Dependency Graph Conversion in English

SLIDE 1

Deep Dependency Graph Conversion in English

15th International Workshop on Treebanks and Linguistic Theories
January 20th, 2017
Jinho D. Choi

SLIDE 2

Why Dependency Structure?

Many robust and scalable dependency parsers are available.

Parser     Reference                     Accuracy   Tokens / Sec.
Yara       Rasooli and Tetreault, 2015   89.32       9,838
Stanford   Chen and Manning, 2014        89.59       8,602
spaCy      Honnibal et al., 2013         90.86      13,963
NLP4J      Choi and McCallum, 2013       91.72      10,271

Comparison between greedy dependency parsers on OntoNotes.

It Depends: Dependency Parser Comparison Using A Web-based Evaluation Tool. Jinho D. Choi, Joel Tetreault, and Amanda Stent. ACL, 2015.

State-of-the-art achieved by a non-greedy parser: 92.50.

Parsing the entire English Wikipedia in 60 hours.

SLIDE 3

Why Dependency Structure?

It is considered “more” universal.

http://universaldependencies.org

48+ languages
391K tokens

SLIDE 4

Why Dependency Conversion?

Most English treebanks are annotated with constituency trees.

Treebank       Trees      Tokens
OntoNotes      138,566    2,620,495
BOLT            78,734      949,300
English Web     16,622      254,830
QuestionBank     4,000       38,188
THYME           88,893      936,166
SHARP           50,725      499,834
CRAFT           21,710      561,017
MIPACQ          19,141      269,178
GENIA           18,541
Total          436,932    6,129,008

vs. 391K tokens in UD

Covers 20+ genres
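
As a quick sanity check, the per-treebank counts above can be summed and compared with the Total row. This is only an arithmetic sketch: the names `TREEBANKS` and `UD_TOKENS` are made up for illustration, the GENIA token count is not given in the source and is left as `None`, and 391K is the rounded UD figure quoted on the slide.

```python
# (trees, tokens) per treebank, copied from the table above.
# GENIA's token count is not given in the source, so it is None.
TREEBANKS = {
    "OntoNotes": (138566, 2620495),
    "BOLT": (78734, 949300),
    "English Web": (16622, 254830),
    "QuestionBank": (4000, 38188),
    "THYME": (88893, 936166),
    "SHARP": (50725, 499834),
    "CRAFT": (21710, 561017),
    "MIPACQ": (19141, 269178),
    "GENIA": (18541, None),
}

total_trees = sum(trees for trees, _ in TREEBANKS.values())
total_tokens = sum(
    tokens for _, tokens in TREEBANKS.values() if tokens is not None
)

UD_TOKENS = 391_000  # "391K tokens", a rounded figure from the slide

print(total_trees)                         # 436932
print(total_tokens)                        # 6129008
print(round(total_tokens / UD_TOKENS, 1))  # 15.7
```

Both sums match the Total row, which suggests that the listed token total does not include GENIA.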

SLIDE 5

Towards Deep Structure

Most dependency parsing approaches have focused on tree parsing (CoNLL 2006, 2007, 2008, 2009, and 2017 shared tasks).

PROS

Allows the development of efficient parsing models.

CONS

Cannot represent the complete set of relations.

PropBank: http://verbs.colorado.edu/propbank/
NomBank: http://nlp.cs.nyu.edu/meyers/NomBank.html
Abstract Meaning Representation: http://amr.isi.edu
Semantic Dependency Parsing: http://sdp.delph-in.net
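
To illustrate the limitation listed under CONS: a dependency tree gives each token exactly one head, so an argument shared by two predicates cannot be attached to both. The sketch below is illustrative only. The sentence, the arc triples, and the function names are made up, and the labels are chosen for the example.

```python
from collections import defaultdict


def heads_of(arcs):
    """Map each dependent to the list of its heads."""
    heads = defaultdict(list)
    for head, dependent, _label in arcs:
        heads[dependent].append(head)
    return dict(heads)


def is_single_headed(arcs):
    """True if every dependent has exactly one head.

    This is a necessary condition for a tree, not a sufficient one
    (a real check would also verify a single root and no cycles).
    """
    return all(len(h) == 1 for h in heads_of(arcs).values())


# Hypothetical analysis of "She wants to leave" as (head, dependent, label).
tree_arcs = [("wants", "She", "nsbj"), ("wants", "leave", "comp")]

# A graph can add the second relation: "She" is also the subject of "leave".
graph_arcs = tree_arcs + [("leave", "She", "nsbj")]

print(is_single_headed(tree_arcs))   # True
print(is_single_headed(graph_arcs))  # False
print(heads_of(graph_arcs)["She"])   # ['wants', 'leave']
```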

SLIDE 6

Towards Deep Structure

PropBank: focused on verbal predicates
NomBank: focused on nominal predicates
Abstract Meaning Representation
Semantic Dependency Parsing

Added nominal and adjectival predicates
Semantically oriented
Penn Treebank: syntactically oriented

Do not necessarily agree with Treebank
Relatively small
Limited to one genre

SLIDE 7

Deep Dependency Graph

Objectives

Consistent representations regardless of syntactic variations.
Rich predicate argument structures.
A large number of deep dependency graphs in multiple genres.

Arguments: Dative, Expletive, Passive, Coordination, Small Clause, Open Clause, Relative Clause

Predicates: Secondary, Light Verb

Auxiliaries: Modal, Raising Verb

SLIDE 8

Secondary Predicate

Secondary predicate → function tag PRD.

[Figure: Universal Dependency]
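
A minimal sketch of how the PRD cue might be read off a Penn Treebank-style constituent label. The helper names are hypothetical and the label parsing is a simplification (it assumes labels of the form CATEGORY-TAG-INDEX). It is not the conversion code behind these slides.

```python
def function_tags(label):
    """Return the function tags in a label such as 'NP-SBJ-1'.

    Labels that start with '-' (e.g. '-NONE-') carry no function tags.
    Numeric indices and '=' gapping indices are ignored.
    """
    if label.startswith("-"):
        return set()
    parts = label.split("=")[0].split("-")
    return {p for p in parts[1:] if p.isalpha()}


def is_secondary_predicate(label):
    """Assumption: a constituent is a candidate if it carries PRD."""
    return "PRD" in function_tags(label)


print(function_tags("NP-SBJ-1"))           # {'SBJ'}
print(is_secondary_predicate("ADJP-PRD"))  # True
print(is_secondary_predicate("NP-SBJ-1"))  # False
```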

SLIDE 9

Secondary Predicate

[Figure: Universal Dependency and Deep Dependency analyses]

SLIDE 10

Light Verb Construction

Light verbs = {make, take, have, do, give, keep}

Eventive nouns are collected from PropBank.

[Figure: Universal Dependency]
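
The rule on this slide can be sketched as a lookup over two word lists. The light-verb set is the one given above. The eventive-noun set here is a tiny placeholder (the real list is collected from PropBank and is not reproduced in these slides), and the function name is hypothetical.

```python
# Light verbs as listed on the slide.
LIGHT_VERBS = {"make", "take", "have", "do", "give", "keep"}

# Placeholder only: the actual eventive nouns come from PropBank.
SAMPLE_EVENTIVE_NOUNS = {"decision", "look", "walk"}


def is_light_verb_construction(verb_lemma, object_lemma,
                               eventive_nouns=SAMPLE_EVENTIVE_NOUNS):
    """True if a light verb takes an eventive noun as its object."""
    return verb_lemma in LIGHT_VERBS and object_lemma in eventive_nouns


print(is_light_verb_construction("make", "decision"))   # True
print(is_light_verb_construction("make", "cake"))       # False
print(is_light_verb_construction("write", "decision"))  # False
```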

SLIDE 11

Dative

Dative → indirect object, DTV or BNF.

SLIDE 12

Expletive

Expletive → existential “there” or pleonastic “it”.

Existential “there”: part-of-speech EX.

[Figure: Deep Dependency vs. Universal Dependency]

SLIDE 13

Expletive

Expletive → existential “there” or pleonastic “it”.

Pleonastic “it”: empty category *EXP*.

[Figure: Universal Dependency vs. Deep Dependency]
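
The two cues for expletives (the part-of-speech tag EX and the empty category *EXP*) could be combined roughly as follows. This is a sketch under assumptions: the function name and the `linked_to_exp` flag are hypothetical, and how a token is actually linked to an *EXP* empty category in the tree is not modeled here.

```python
def expletive_kind(word, pos, linked_to_exp=False):
    """Classify a token as an expletive, or return None.

    pos           -- Penn Treebank part-of-speech tag of the token
    linked_to_exp -- assumed flag: True if the token's constituent is
                     associated with an *EXP* empty category
    """
    if pos == "EX":
        return "existential"
    if word.lower() == "it" and linked_to_exp:
        return "pleonastic"
    return None


print(expletive_kind("There", "EX"))                    # existential
print(expletive_kind("It", "PRP", linked_to_exp=True))  # pleonastic
print(expletive_kind("It", "PRP"))                      # None
```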

SLIDE 14

Passive Construction

Heuristics for LINK-PSV in PropBank.

Secondary Dependency

SLIDE 15

Coordination

Arguments in coordination are explicitly represented in constituency trees.

SLIDE 16

Small Clause

Small clause → S consisting of only SBJ and PRD.

[Figures: Deep Dependency vs. Universal Dependency]
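
One way to read the small-clause definition is as a test on an S node's children. The sketch below assumes Penn Treebank-style labels and represents a node only by its label and the labels of its children. Both helper names are hypothetical.

```python
def has_tag(label, tag):
    """True if a label such as 'NP-SBJ-1' carries the given function tag."""
    return tag in label.split("-")[1:]


def is_small_clause(label, child_labels):
    """True for an S whose children are only SBJ and PRD constituents."""
    if label.split("-")[0] != "S":
        return False
    has_sbj = any(has_tag(c, "SBJ") for c in child_labels)
    has_prd = any(has_tag(c, "PRD") for c in child_labels)
    only_those = all(
        has_tag(c, "SBJ") or has_tag(c, "PRD") for c in child_labels
    )
    return has_sbj and has_prd and only_those


# e.g. "consider [S [NP-SBJ him] [ADJP-PRD capable]]"
print(is_small_clause("S", ["NP-SBJ", "ADJP-PRD"]))  # True
print(is_small_clause("S", ["NP-SBJ", "VP"]))        # False
```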

SLIDE 17

Open Clause

Open clause → a clause without an internal subject.

SLIDE 18

Relative Clause

Relative clause → empty category *T*.

Heuristics for LINK-SLC in PropBank.

SLIDE 19

Modal Adjective

able      915   ready       105   prepared   32   due          24   glad        21
likely    235   happy        69   eager      30   sure         24   unwilling   20
willing   173   about        49   free       30   determined   22   busy        18
unable    165   reluctant    44   unlikely   28   afraid       22   qualified   16

An adjectival predicate including an open clause whose external subject is the subject of the adjectival predicate.

SLIDE 20

Raising Verb

Raising verb → empty category *-d to SBJ.

have       1,846   begin    825   stop   379   keep    158   prove    89
go         1,461   seem     787   be     322   use     157   turn     67
continue   1,210   appear   714   fail   233   get     136   happen   38
need       1,038   start    546   tend   168   ought    91   expect   38

SLIDE 21

Deep Dependency Labels

Subject
  csbj    Clausal subject                    5,291      123
  expl    Expletive                         10,808
  nsbj    Nominal subject                  298,418   71,383
Object
  comp    Clausal complement                86,884      105
  dat     Dative                             6,763       87
  obj     (Direct or preposition) object   205,149   20,785
Auxiliary
  aux     Auxiliary verb                   148,829
  cop     Copula                            81,661
  lv      Light verb                         7,655
  modal   Modal (verb or adjective)         49,259
  raise   Raising verb                      10,598
Nominal and Quantifier
  acl     Clausal modifier of nominal       24,791        7
  appo    Apposition                        32,460       17
  attr    Attribute                        352,939       14
  det     Determiner                       334,784
  num     Numeric modifier                  95,957
  poss    Possessive modifier               62,489
  relcl   Relative clause                   35,371
Adverbial
  adv     Adverbial                        156,473    7,736
  advcl   Adverbial clause                  49,503    1,750
  advnp   Adverbial noun phrase             73,026      480
  neg     Negation                          26,373    1,037
  ppmod   Preposition phrase               371,927    4,471
Particle
  case    Case marker                      420,045
  mark    Clausal marker                    47,286
  prt     Verb particle                     13,078
Coordination
  cc      Coordinating conjunction         131,622
  conj    Conjunct                         137,128
  com     Compound word                    270,326

SLIDE 22

Conclusion

Contributions

Generate a large and diverse corpus of deep dependency graphs (6M+ tokens, 20+ genres).
Consistent representations; rich predicate argument structures.

Future Work

Development of graph parsing models.
Logic representation.
Integration with PropBank.