
Deep Dependency Graph Conversion in English



  1. Deep Dependency Graph Conversion in English
     15th International Workshop on Treebanks and Linguistic Theories, January 20th, 2017
     Jinho D. Choi

  2. Why Dependency Structure?
     Many robust and scalable dependency parsers are available.
     Parser    Reference                     Accuracy  Tokens / Sec.
     Yara      Rasooli and Tetreault, 2015   89.32      9,838
     Stanford  Chen and Manning, 2014        89.59      8,602
     spaCy     Honnibal et al., 2013         90.86     13,963
     NLP4J     Choi and McCallum, 2013       91.72     10,271
     Comparison between greedy dependency parsers on OntoNotes. Source: "It Depends: Dependency Parser Comparison Using A Web-based Evaluation Tool", Jinho D. Choi, Joel Tetreault, and Amanda Stent, ACL, 2015.
     State-of-the-art achieved by a non-greedy parser: 92.50.
     Parsing the entire English Wikipedia in 60 hours.

  3. Why Dependency Structure?
     It is considered “more” universal.
     48+ languages (http://universaldependencies.org).
     391K tokens.

  4. Why Dependency Conversion?
     Most English treebanks are annotated with constituency trees.
     Treebank        Trees     Tokens
     OntoNotes     138,566  2,620,495
     BOLT           78,734    949,300
     English Web    16,622    254,830
     QuestionBank    4,000     38,188
     THYME          88,893    936,166
     SHARP          50,725    499,834
     CRAFT          21,710    561,017
     MIPACQ         19,141    269,178
     GENIA          18,541
     Total         436,932  6,129,008
     Covers 20+ genres, vs. 391K tokens in UD.

  5. Towards Deep Structure
     Most dependency parsing approaches have focused on tree parsing (CoNLL 2006, 2007, 2008, 2009, 2017 shared tasks).
     Pros: allows the development of efficient parsing models.
     Cons: cannot represent the complete relations.
     PropBank: http://verbs.colorado.edu/propbank/
     NomBank: http://nlp.cs.nyu.edu/meyers/NomBank.html
     Abstract Meaning Representation: http://amr.isi.edu
     Semantic Dependency Parsing: http://sdp.delph-in.net

  6. Towards Deep Structure
     Penn Treebank: syntactically oriented.
     PropBank: focused on verbal predicates; added nominal and adjectival predicates.
     NomBank: focused on nominal predicates.
     PropBank and NomBank do not necessarily agree with Treebank.
     Abstract Meaning Representation and Semantic Dependency Parsing: semantically oriented; relatively small; limited to one genre.

  7. Deep Dependency Graph Objectives
     A large number of deep dependency graphs in multiple genres.
     Consistent representations regardless of syntactic variations.
     Rich predicate argument structures.
     Predicates, arguments, and auxiliaries covered: secondary predicate, dative, small clause, modal, light verb, expletive, open clause, raising verb, passive, relative clause, coordination.

  8. Secondary Predicate
     Secondary predicate → function tag PRD.
     Figure: comparison vs. Universal Dependency.

  9. Secondary Predicate
     Figure: Universal Dependency vs. Deep Dependency.

  10. Light Verb Construction
      Light verbs = {make, take, have, do, give, keep}.
      Eventive nouns are collected from PropBank.
      Figure: Universal Dependency.
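A minimal sketch of how the light verb check on this slide could look. The verb set is taken from the slide; the `EVENTIVE_NOUNS` set is a tiny hypothetical stand-in for the list collected from PropBank, and pairing the verb with its object noun is an assumption, since the slide does not spell out the exact test.

```python
# Light verbs as listed on the slide.
LIGHT_VERBS = {"make", "take", "have", "do", "give", "keep"}

# Hypothetical placeholder for the eventive nouns collected from PropBank.
EVENTIVE_NOUNS = {"decision", "look", "walk", "call"}


def is_light_verb_construction(verb_lemma, object_lemma,
                               eventive_nouns=EVENTIVE_NOUNS):
    """Return True if the verb is a light verb and its object is an eventive noun."""
    return (verb_lemma.lower() in LIGHT_VERBS
            and object_lemma.lower() in eventive_nouns)
```

Under these assumptions, "make a decision" qualifies, while "make a cake" does not because "cake" is not in the eventive noun list.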

  11. Dative
      Dative → indirect object, or function tag DTV or BNF.
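The function tag part of the dative rule can be sketched as a check on Penn Treebank-style constituent labels. This is only an illustration: the helper name `has_dative_tag` is hypothetical, and detecting indirect objects that carry no function tag is not covered here.

```python
# Function tags named on the slide.
DATIVE_TAGS = {"DTV", "BNF"}


def has_dative_tag(label):
    """Return True if a label such as 'PP-DTV' or 'NP-BNF-1' carries DTV or BNF."""
    if label.startswith("-"):
        # Labels like -NONE- or -LRB- have no function tags.
        return False
    # Drop gapping indices ('=2'), then skip the base category before the first '-'.
    tags = label.split("=")[0].split("-")[1:]
    return any(tag in DATIVE_TAGS for tag in tags)
```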

  12. Expletive
      Expletive → existential “there” or pleonastic “it”.
      Identified by the part-of-speech tag EX.
      Figure: Deep Dependency vs. Universal Dependency.

  13. Expletive
      Expletive → existential “there” or pleonastic “it”.
      Identified by the empty category *EXP*.
      Figure: Deep Dependency vs. Universal Dependency.
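The two cues on the expletive slides (part-of-speech EX and the empty category *EXP*) can be illustrated with a small tree walk. This is a sketch under assumptions: trees are encoded as nested `(label, children)` tuples with `(pos, word)` leaves, which is a hypothetical encoding, and the pleonastic "it" is assumed to be a sibling of the *EXP* trace inside the same NP.

```python
def leaves(node):
    """Return the (pos, word) leaves under a node."""
    label, body = node
    if isinstance(body, str):
        return [(label, body)]
    return [leaf for child in body for leaf in leaves(child)]


def is_exp_trace(leaf):
    pos, form = leaf
    return pos == "-NONE-" and form.startswith("*EXP*")


def find_expletives(node, found=None):
    """Collect existential 'there' (POS EX) and pleonastic 'it' (next to *EXP*)."""
    found = [] if found is None else found
    label, body = node
    if isinstance(body, str):
        if label == "EX":
            found.append(body)
        return found
    child_leaves = [leaves(child) for child in body]
    has_trace = any(len(ls) == 1 and is_exp_trace(ls[0]) for ls in child_leaves)
    if label.startswith("NP") and has_trace:
        for ls in child_leaves:
            if len(ls) == 1 and ls[0][0] == "PRP" and ls[0][1].lower() == "it":
                found.append(ls[0][1])
    for child in body:
        find_expletives(child, found)
    return found


there_tree = ("S", [("NP-SBJ", [("EX", "There")]),
                    ("VP", [("VBZ", "is"),
                            ("NP-PRD", [("DT", "a"), ("NN", "problem")])])])
it_tree = ("S", [("NP-SBJ", [("NP", [("PRP", "It")]),
                             ("S", [("-NONE-", "*EXP*-1")])]),
                 ("VP", [("VBZ", "is"), ("ADJP-PRD", [("JJ", "easy")])])])
```

With these toy trees, `find_expletives(there_tree)` picks up "There" and `find_expletives(it_tree)` picks up "It".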

  14. Passive Construction
      Heuristics for LINK-PSV in PropBank.
      Secondary Dependency.

  15. Coordination
      Arguments in coordination are explicitly represented in constituency trees.
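To illustrate why shared arguments in coordination call for a graph rather than a tree, here is a minimal sketch of a dependency graph in which a token may have more than one head. The `Node` class, the helper names, the example sentence, and the attachment of `cc` are all assumptions made for illustration; only the labels (`nsbj`, `obj`, `conj`, `cc`) come from the label inventory in this deck.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    form: str
    heads: list = field(default_factory=list)  # (head index, label) pairs


def add_arc(nodes, dep, head, label):
    """Attach the token at index `dep` to the token at index `head`."""
    nodes[dep].heads.append((head, label))


def is_single_headed(nodes):
    """True if no token has more than one head, as in a dependency tree."""
    return all(len(n.heads) <= 1 for n in nodes)


words = ["John", "bought", "and", "ate", "apples"]
nodes = [Node(w) for w in words]

# Tree arcs: every token except the root "bought" gets exactly one head.
add_arc(nodes, 0, 1, "nsbj")
add_arc(nodes, 3, 1, "conj")
add_arc(nodes, 2, 3, "cc")   # attachment of cc is an assumption
add_arc(nodes, 4, 1, "obj")
tree_only = is_single_headed(nodes)

# Extra arcs: the shared subject and object also attach to the second conjunct.
add_arc(nodes, 0, 3, "nsbj")
add_arc(nodes, 4, 3, "obj")
```

After the extra arcs are added, "John" and "apples" each have two heads, which a tree cannot express.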

  16. Small Clause
      Small clause → S consisting of only SBJ and PRD.
      Figures: Deep Dependency vs. Universal Dependency.
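The small clause definition above can be read as a simple test on a constituent and its children. The following is a sketch, assuming Penn Treebank-style labels passed as plain strings; the function names are hypothetical, and accepting SBJ and PRD in either order is an assumption.

```python
def has_tag(label, tag):
    """True if a label such as 'NP-SBJ-1' carries the given function tag."""
    return tag in label.split("=")[0].split("-")[1:]


def is_small_clause(label, child_labels):
    """True if the node is an S whose only children are one SBJ and one PRD."""
    if label.split("=")[0].split("-")[0] != "S":
        return False
    if len(child_labels) != 2:
        return False
    first, second = child_labels
    return ((has_tag(first, "SBJ") and has_tag(second, "PRD"))
            or (has_tag(first, "PRD") and has_tag(second, "SBJ")))
```

For example, an S with children NP-SBJ and ADJP-PRD passes the test, while an S with children NP-SBJ and VP does not.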

  17. Open Clause
      Open clause → a clause without an internal subject.

  18. Relative Clause
      Open clause → empty category *T*.
      Heuristics for LINK-SLC in PropBank?

  19. Modal Adjective
      An adjectival predicate including an open clause whose external subject is the subject of the adjectival predicate.
      able     915   ready      105   prepared  32   due         24   glad       21
      likely   235   happy       69   eager     30   sure        24   unwilling  20
      willing  173   about       49   free      30   determined  22   busy       18
      unable   165   reluctant   44   unlikely  28   afraid      22   qualified  16

  20. Raising Verb
      Raising verb → empty category *-d to SBJ.
      have      1,846   begin   825   stop  379   keep   158   prove   89
      go        1,461   seem    787   be    322   use    157   turn    67
      continue  1,210   appear  714   fail  233   get    136   happen  38
      need      1,038   start   546   tend  168   ought   91   expect  38

  21. Deep Dependency Labels
      Subject
        Clausal subject                    5,291     123  csbj
        Expletive                         10,808       0  expl
        Nominal subject                  298,418  71,383  nsbj
      Object
        Clausal complement                86,884     105  comp
        Dative                             6,763      87  dat
        (Direct or preposition) object   205,149  20,785  obj
      Auxiliary
        Auxiliary verb                   148,829       0  aux
        Copula                            81,661       0  cop
        Light verb                         7,655       0  lv
        Modal (verb or adjective)         49,259       0  modal
        Raising verb                      10,598       0  raise
      Nominal and Quantifier
        Clausal modifier of nominal       24,791       7  acl
        Apposition                        32,460      17  appo
        Attribute                        352,939      14  attr
        Determiner                       334,784       0  det
        Numeric modifier                  95,957       0  num
        Possessive modifier               62,489       0  poss
        Relative clause                   35,371       0  relcl
      Adverbial
        Adverbial                        156,473   7,736  adv
        Adverbial clause                  49,503   1,750  advcl
        Adverbial noun phrase             73,026     480  advnp
        Negation                          26,373   1,037  neg
        Preposition phrase               371,927   4,471  ppmod
      Particle
        Case marker                      420,045       0  case
        Clausal marker                    47,286       0  mark
        Verb particle                     13,078       0  prt
      Coordination
        Coordinating conjunction         131,622       0  cc
        Conjunct                         137,128       0  conj
      Compound word                      270,326       0  com

  22. Conclusion
      Contributions: generate a large and diverse corpus of deep dependency graphs (6M+ tokens, 20+ genres) with consistent and rich representations.
      Future work: development of graph parsing models; integration with PropBank; logic representation.
