SLIDE 1

The PARSEME Shared Task

on Automatic Identification of

Verbal Multiword Expressions (1.1)

C. Ramisch1, S. R. Cordeiro1, A. Savary2, V. Vincze3, V. Barbu Mititelu4, A. Bhatia5, M. Buljan6, M. Candito7, P. Gantar8, V. Giouli9, T. Güngör10, A. Hawwari11, U. Iñurrieta12, J. Kovalevskaitė13, S. Krek14, T. Lichte15, C. Liebeskind16, J. Monti17, C. Parra18, B. QasemiZadeh15, R. Ramisch19, N. Schneider20, I. Stoyanova21, A. Vaidya22, A. Walsh18

1Aix-Marseille University, France, 2University of Tours, France, 3University of Szeged, Hungary, 4Romanian Academy, Romania, 5Florida IHMC, USA, 6University of Stuttgart, Germany, 7Paris Diderot University, France, 8Faculty of Arts, Slovenia, 9Athena Research Center, Greece, 10Boğaziçi University, Turkey, 11George Washington University, USA, 12University of the Basque Country, Spain, 13Vytautas Magnus University, Lithuania, 14Jožef Stefan Institute, Slovenia, 15University of Düsseldorf, Germany, 16Jerusalem College of Technology, Israel, 17“L’Orientale” University of Naples, Italy, 18Dublin City University, Ireland, 19Interinstitutional Center for Computational Linguistics, Brazil, 20Georgetown University, USA, 21Bulgarian Academy of Sciences, Bulgaria, 22IIT Delhi, India

SLIDE 2

A multilingual shared task on MWE identification

What is MWE identification?
INPUT: text
OUTPUT: text annotated with MWEs
PARSEME shared task – edition 1.0 in 2017
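To make the input/output contract concrete, here is a tiny, hypothetical example in Python; the sentence and the dictionary layout are illustrative only, not the shared task's data format:

```python
# Hypothetical illustration of VMWE identification: tokens in, annotated spans out.
input_tokens = ["He", "finally", "turned", "the", "old", "TV", "off", "."]

# Expected output: one VMWE (id 1) of category VPC.full covering the
# discontinuous span "turned ... off" (tokens 3 and 7, 1-based).
expected_output = {1: {"category": "VPC.full", "token_ids": [3, 7]}}
print(expected_output)
```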

SLIDE 3

Why focus on verbal MWEs (VMWEs)? I

Discontinuity:

EN turn the TV off

Variability: morphological, syntactic, lexical

EN we made decisions vs. the decision was hard to make

Non-categorical nature: Same surface, different syntax

EN take on the task (VPC.full) vs. to sit on the chair

Same syntax, different category

EN to make a mistake (LVC.full) EN to make a meal of sth (VID)

Ambiguity: idiomatic vs. literal readings

EN to take the cake

SLIDE 4

Why focus on verbal MWEs (VMWEs)? II

Overlaps: Factorization

EN take a walk and then a long shower (coordination)

Nesting

open slots:

EN take the fact that I gave up into account

lexicalized components:

EN let the cat out of the bag

Multiword tokens

ES abstener|se (lit. abstain self) ’abstain’, DE auf|machen (lit. up|make) ’open’

Different languages ⇒ different behavior, linguistic traditions...

SLIDE 5

PARSEME shared task 1.0 at a glance

Multilingual guidelines with examples
Annotation methodology and teams (PARSEME)
Corpora in 18 languages under free licenses
Train/test corpora with 52,724/9,494 VMWEs
New evaluation measures (MWE-/token-based)
7 participating systems

SLIDE 6

Enhanced guidelines

Discussion via GitLab issues
Main definitions remain:
  Words and tokens
  Lexicalized components and open slots
  Canonical forms
Generic decision tree based on structural tests

SLIDE 7

Decision tree

Annotation guidelines

shared task on automatic identification of verbal MWEs - edition 1.1 (2018)

Annotation process and decision tree

We propose the following methodology for VMWE annotation:
Step 1 - identify a candidate, that is, a combination of a verb with at least one other word which could form a VMWE. If the candidate has the structure of a meaning-preserving variant, the following steps apply to its canonical form. This step is largely based on the annotators' linguistic knowledge and intuition after reading this guide.
Step 2 - determine which components of the candidate (or of its canonical form) are lexicalized, that is, components such that, if they are omitted, the VMWE no longer occurs. Corpus and web searches may be required to confirm intuitions about acceptable variants.
Step 3 - depending on the syntactic structure of the candidate's canonical form, formally check whether it is a VMWE using the generic and category-specific decision trees and tests below. Note that the intuitions used in Step 1 to identify a candidate are not sufficient to annotate it: you must confirm them by applying the tests in the guidelines.
Step 4 (experimental and optional) - if your language team chose to experimentally annotate the IAV category, follow the dedicated inherently adpositional verb (IAV) tests. These tests should always be applied once the three previous steps are complete, i.e. the IAV annotation overlays the universal annotation.
The decision tree below indicates the order in which tests should be applied in Step 3. The decision trees are a useful summary to consult during annotation, but contain only very short descriptions of the tests; each test is detailed and explained with examples in the following sections.
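As an illustration of what Steps 1 and 2 produce before the tests of Step 3 are applied, the sketch below records one candidate; the Candidate structure and its field names are illustrative, not part of the guidelines or the annotation tools:

```python
from dataclasses import dataclass, field

@dataclass
class Candidate:
    """Outcome of Steps 1-2 for a single VMWE candidate (illustrative structure)."""
    surface: str                  # occurrence spotted in the text (Step 1)
    canonical_form: str           # canonical form, if the occurrence is a variant
    lexicalized: list = field(default_factory=list)  # components that cannot be omitted (Step 2)

# "the decision was hard to make" is a meaning-preserving variant of "make a decision";
# only "make" and "decision" are lexicalized, the determiner is an open slot.
cand = Candidate(surface="the decision was hard to make",
                 canonical_form="make a decision",
                 lexicalized=["make", "decision"])
print(cand)
```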

Generic decision tree

If you are annotating Italian or Hindi, go to the Italian-specific or Hindi-specific decision tree. For all other languages, follow the tree below.
Apply test S.1 - [1HEAD: Unique verb as functional syntactic head of the whole?]
  NO ⇒ Apply the VID-specific tests ⇒ VID tests positive?
    YES ⇒ Annotate as a VMWE of category VID
    NO ⇒ It is not a VMWE, exit
  YES ⇒ Apply test S.2 - [1DEP: Verb v has exactly one lexicalized dependent d?]
    NO ⇒ Apply the VID-specific tests ⇒ VID tests positive?
      YES ⇒ Annotate as a VMWE of category VID
      NO ⇒ It is not a VMWE, exit
    YES ⇒ Apply test S.3 - [LEX-SUBJ: Lexicalized subject?]
      YES ⇒ Apply the VID-specific tests ⇒ VID tests positive?
        YES ⇒ Annotate as a VMWE of category VID
        NO ⇒ It is not a VMWE, exit
      NO ⇒ Apply test S.4 - [CATEG: What is the morphosyntactic category of d?]
        Reflexive clitic ⇒ Apply the IRV-specific tests ⇒ IRV tests positive?
          YES ⇒ Annotate as a VMWE of category IRV
          NO ⇒ It is not a VMWE, exit
        Particle ⇒ Apply the VPC-specific tests ⇒ VPC tests positive?
          YES ⇒ Annotate as a VMWE of category VPC.full or VPC.semi
          NO ⇒ It is not a VMWE, exit
        Verb with no lexicalized dependent ⇒ Apply the MVC-specific tests ⇒ MVC tests positive?
          YES ⇒ Annotate as a VMWE of category MVC
          NO ⇒ Apply the VID-specific tests ⇒ VID tests positive?
            YES ⇒ Annotate as a VMWE of category VID
            NO ⇒ It is not a VMWE, exit
        Extended NP ⇒ Apply the LVC-specific decision tree ⇒ LVC tests positive?
          YES ⇒ Annotate as a VMWE of category LVC
          NO ⇒ Apply the VID-specific tests ⇒ VID tests positive?
            YES ⇒ Annotate as a VMWE of category VID
            NO ⇒ It is not a VMWE, exit
        Another category ⇒ Apply the VID-specific tests ⇒ VID tests positive?
          YES ⇒ Annotate as a VMWE of category VID
          NO ⇒ It is not a VMWE, exit
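Read as code, the generic tree collapses into a handful of nested conditionals. The sketch below renders it in Python with each guideline test passed in as a precomputed answer; the function name, argument names and the category strings for S.4 are illustrative stand-ins, not part of the shared task tooling:

```python
def classify_vmwe(s1_head, s2_one_dep, s3_lex_subj, s4_categ,
                  vid_ok=False, irv_ok=False, vpc=None, mvc_ok=False, lvc=None):
    """Walk the generic decision tree given precomputed test outcomes.

    s1_head     S.1 [1HEAD]    -- unique verb is the functional head of the whole?
    s2_one_dep  S.2 [1DEP]     -- the verb has exactly one lexicalized dependent d?
    s3_lex_subj S.3 [LEX-SUBJ] -- d is a lexicalized subject?
    s4_categ    S.4 [CATEG]    -- morphosyntactic category of d
    vid_ok / irv_ok / mvc_ok   -- outcomes of the category-specific tests
    vpc: "VPC.full", "VPC.semi" or None; lvc: "LVC.full", "LVC.cause" or None
    Returns the VMWE category, or None if the candidate is not a VMWE.
    """
    if not s1_head or not s2_one_dep or s3_lex_subj:
        return "VID" if vid_ok else None
    if s4_categ == "reflexive clitic":
        return "IRV" if irv_ok else None
    if s4_categ == "particle":
        return vpc
    if s4_categ == "verb":                    # verb with no lexicalized dependent
        return "MVC" if mvc_ok else ("VID" if vid_ok else None)
    if s4_categ == "extended NP":
        return lvc if lvc else ("VID" if vid_ok else None)
    return "VID" if vid_ok else None          # any other category of d

# Example: "take a walk" -- S.1 yes, S.2 yes, S.3 no, d is an extended NP, LVC tests pass.
print(classify_vmwe(True, True, False, "extended NP", lvc="LVC.full"))  # -> LVC.full
```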

SLIDE 8

VMWE typology I

Universal categories (all languages)
verbal idioms (VID)

EN to call it a day

light-verb constructions (LVCs)

EN to give a lecture (LVC.full) EN to grant rights (LVC.cause)

SLIDE 9

VMWE typology II

Quasi-universal categories (many languages)
inherently reflexive verbs (IRVs)

EN to help oneself ’to take something freely’

verb-particle constructions (VPCs)

EN to do in ’to kill’ (VPC.full) EN to eat up (VPC.semi)

multi-verb constructions (MVCs)

HI kar le-na (lit. do take.INF) ’to do something (for one’s own benefit)’

SLIDE 10

VMWE typology III

Optional/language-specific categories
inherently clitic verbs (LS.ICV)

IT prenderle (lit. take it) ’get beaten up’

inherently adpositional verbs (IAV)

EN to rely on

Require more work to be generalized/stabilized

SLIDE 11

20 languages

Language groups
Balto-Slavic: Bulgarian (BG), Croatian (HR), Lithuanian (LT), Polish (PL), Slovene (SL), Czech (CZ)
Germanic: German (DE), English (EN), Swedish (SV)
Romance: French (FR), Italian (IT), Romanian (RO), Spanish (ES), Brazilian Portuguese (PT)
Others: Arabic (AR), Greek (EL), Basque (EU), Farsi (FA), Hebrew (HE), Hindi (HI), Hungarian (HU), Turkish (TR), Maltese (MT)

SLIDE 12

Corpora

Corpus   Sentences   Tokens      VMWEs
train     208,420    4,553,431   59,460
dev        31,947      672,102    9,250
test       40,471      846,798   10,616
total     280,838    6,072,331   79,326

Varying corpus sizes per language
No dev set in EN, HI and LT
New rules for the train/dev/test split
Morphological/syntactic information (mostly UD)
Availability: 19 corpora released under Creative Commons licenses

SLIDE 13

Format

CUPT: extension of the CoNLL-U format

ID  FORM        LEMMA       UPOS   XPOS FEATS HEAD DEPREL  DEPS MISC PARSEME:MWE
1   ,           ,           PUNCT  _    _     4    punct   _    _    *
2   si          si          SCONJ  _    _     4    mark    _    _    *
3   vous        il          PRON   _    _     4    nsubj   _    _    *
4   présentez   présenter   VERB   _    _     0    root    _    _    1:LVC.full
5   ou          ou          CCONJ  _    _     8    cc      _    _    *
6   avez        avoir       AUX    _    _     8    aux     _    _    *
7   récemment   récemment   ADV    _    _     8    advmod  _    _    *
8   présenté    présenter   VERB   _    _     4    conj    _    _    2:LVC.full
9   un          un          DET    _    _     10   det     _    _    *
10  saignement  saignement  NOUN   _    _     4    obj     _    _    1;2

FR, ≈ ‘if you present or have recently presented (a case of) bleeding’. VMWEs 1 and 2 are the coordinated LVCs présentez ... saignement and (avez) présenté ... saignement; the shared noun saignement belongs to both, hence 1;2 on token 10.
SLIDE 14

Corpus quality

Single-annotated for most languages
Consistency checks for 19 languages
Inter-annotator agreement on a sample (around 150 to 2,500 sentences)
Identification IAA: κspan from 0.227 to 0.984 across languages (per-language chart in the original slide)
Categorization IAA: κcat from 0.573 to 1.000
Macro-averages higher than in edition 1.0: κspan 0.58 → 0.691, κcat 0.819 → 0.836

SLIDE 15

Shared task

Goal: automatically identify all VMWE occurrences in running text.
Two tracks:
  Closed: only the provided training/dev data
  Open: the provided data + any external resources (corpora, lexicons, grammars, language models, ...)
Evaluation: based on identification only; categorization quality is reported on but not ranked.

SLIDE 16

Evaluation measures (as in edition 1.0)

Per-language system evaluation
Compare predictions against the gold standard: precision, recall and F1-measure
MWE-based scores: only predictions with a perfect span are considered to match
Token-based scores: allow partial matches
  We consider all partial bijections from gold to system VMWEs
  The partial bijection maximizing the system score is chosen
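For the MWE-based variant, a prediction counts only when its token span exactly equals a gold span, so P/R/F1 reduce to exact set matching. A minimal sketch of that computation (the token-based measure, which searches for the score-maximizing partial bijection, is not shown):

```python
def mwe_based_scores(gold, pred):
    """MWE-based P/R/F1: spans are sets of token positions; only exact matches count."""
    gold_spans = {frozenset(g) for g in gold}
    pred_spans = {frozenset(p) for p in pred}
    tp = len(gold_spans & pred_spans)
    precision = tp / len(pred_spans) if pred_spans else 0.0
    recall = tp / len(gold_spans) if gold_spans else 0.0
    f1 = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
    return precision, recall, f1

# Gold: {turned, off} and {gave, up}; system: {turned, off} and a spurious span.
print(mwe_based_scores([{3, 7}, {10, 12}], [{3, 7}, {15, 18}]))  # (0.5, 0.5, 0.5)
```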

SLIDE 17

Evaluation measures (new in edition 1.1)

Cross-lingual macro-averages: token-based and MWE-based scores
Phenomenon-specific scores: MWE-based P/R/F1 on the subset of predictions & gold standard restricted to VMWEs that exhibit a given phenomenon:
  Continuous vs. discontinuous
  Multi-token vs. single-token
  Seen vs. unseen (w.r.t. the training corpus; see the sketch below)
  Identical vs. variant (w.r.t. the training corpus)
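For the seen/unseen split, a test VMWE is treated as "seen" when the multiset of lemmas of its lexicalized components was annotated as a VMWE in the training corpus; the sketch below assumes that reading of the definition, and the helper and variable names are illustrative:

```python
from collections import Counter

def is_seen(test_vmwe_lemmas, train_vmwe_lemma_sets):
    """True if the multiset of lexicalized-component lemmas occurs among training VMWEs."""
    target = Counter(test_vmwe_lemmas)
    return any(Counter(train) == target for train in train_vmwe_lemma_sets)

train = [("make", "decision"), ("take", "account", "into")]
print(is_seen(("decision", "make"), train))  # True  -- seen, order-insensitive
print(is_seen(("give", "lecture"), train))   # False -- unseen
```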

SLIDE 18

Submitted systems

12 teams (vs. 7 in edition 1.0)
From France (3), Germany (3), Ireland (1), Italy (1), Romania (1), Switzerland (1), Turkey (1), UK (1)
17 system submissions: 13 closed track + 4 open track
16/17 submissions cover 3 or more languages
11/17 submissions cover 19 languages

SLIDE 19

Techniques

Approaches (system-by-technique matrix in the original slide): neural networks, parsing, CRF, statistical measures, naive Bayes
Systems: Deep-BGT, Milos, CRF-DepTree-categs, Polirem, varIDE, GBD-NER, MWETreeC, CRF-Seq-nocategs, mumpitz, TRAVERSAL, SHOMA, TRAPACC, Veyn

SLIDE 20

Some results (more online) I

Average MWE-based scores

submission        track   P      R      F1
TRAVERSAL         closed  67.58  44.97  54.00
TRAPACC-S         closed  62.28  41.40  49.74
TRAPACC           closed  55.68  44.67  49.57
CRF-Seq-nocategs  closed  56.13  39.12  46.11
SHOMA             open    66.08  51.82  58.09

System strengths:
TRAVERSAL: Slavic and Romance languages
TRAPACC: German and English
CRF-Seq-nocategs: Hindi

SLIDE 21

Some results (more online) II

Languages
“Easiest” languages: Hungarian (F1=90.31) and Romanian (F1=85.28) ⇒ largest training corpora
“Hardest” languages: Hebrew (F1=23.28), English (F1=32.88) and Lithuanian (F1=32.17) ⇒ smallest training corpora
Outlier: Hindi (F1=72.98) ⇒ MVCs are “easy”
Phenomenon-specific scores
Discontinuous, variant and unseen VMWEs are much harder
Variants: average recall, high precision (Rmax=.56, Pmax=.86)
Unseen: low recall and precision (Rmax=.38, Pmax=.32)

SLIDE 22

Conclusions

Benchmark results
Outcomes: freely available guidelines and corpora; 12 teams, 17 submissions, all multilingual
There is room for improvement
Findings:
  Inherently lexical nature of the phenomenon
  Unseen VMWEs are harder to generalize over than other unseen entities (e.g. NEs)
  VMWE identification and discovery should go hand in hand

SLIDE 23

Future work

Continuous enhancement of guidelines and corpora (quality, size)
IAV feedback and status
New languages and language families
Future shared task editions
New MWE categories (e.g. nominal)
Joint MWE identification and parsing or NER
Synergies with other multilingual initiatives (e.g. UD)

SLIDE 24

Acknowledgements

Funding agencies and projects:

COST action PARSEME (IC1207)
Czech Republic LD-PARSEME (LD14117)
ANR PARSEME-FR (ANR-14-CERA-0001)
Marie Skłodowska-Curie (Grant 713567)
Science Foundation Ireland (Grant 13/RC/2106)
Deutsche Forschungsgemeinschaft (CRC 991)
Ministry of Human Capacities, Hungary (UNKP-17-4 New National Excellence Program)
Slovenian Research Agency (project J6-8256)
DST-CSRI, Govt. of India
Boğaziçi University Research Fund (Grant 14420)

FLAT development: Maarten van Gompel, Radboud University, The Netherlands

SLIDE 25

Annotation teams

Balto-Slavic languages: (BG) Ivelina Stoyanova (LL), Tsvetana Dimitrova, Svetlozara Leseva, Valentina Stefanova, Maria Todorova; (HR) Maja Buljan (LL), Goranka Blagus, Ivo-Pavao Jazbec, Kristina Kocijan, Nikola Ljubešić, Ivana Matas, Jan Šnajder; (LT) Jolanta Kovalevskaitė (LL), Agnė Bielinskienė, Loic Boizou; (PL) Agata Savary (LL), Emilia Palka-Binkiewicz; (SL) Polona Gantar (LL), Simon Krek (LL), Špela Arhar Holdt, Jaka Čibej, Teja Kavčič, Taja Kuzman.
Germanic languages: (DE) Timm Lichte (LL), Rafael Ehren; (EN) Abigail Walsh (LL), Claire Bonial, Paul Cook, Kristina Geeraert, John McCrae, Nathan Schneider, Clarissa Somers.
Romance languages: (ES) Carla Parra Escartín (LL), Cristina Aceta, Héctor Martínez Alonso; (FR) Marie Candito (LL), Matthieu Constant, Carlos Ramisch, Caroline Pasquer, Yannick Parmentier, Jean-Yves Antoine, Agata Savary; (IT) Johanna Monti (LL), Valeria Caruso, Maria Pia di Buono, Antonio Pascucci, Annalisa Raffone, Anna Riccio; (RO) Verginica Barbu Mititelu (LL), Mihaela Onofrei, Mihaela Ionescu; (PT) Renata Ramisch (LL), Aline Villavicencio, Carlos Ramisch, Helena de Medeiros Caseli, Leonardo Zilio, Silvio Ricardo Cordeiro.
Other languages: (AR) Abdelati Hawwari (LL), Mona Diab, Mohamed Elbadrashiny, Rehab Ibrahim; (EU) Uxoa Iñurrieta (LL), Itziar Aduriz, Ainara Estarrona, Itziar Gonzalez, Antton Gurrutxaga, Ruben Urizar; (EL) Voula Giouli (LL), Vassiliki Foufi, Aggeliki Fotopoulou, Stella Markantonatou, Stella Papadelli; (FA) Behrang QasemiZadeh (LL), Shiva Taslimipoor; (HE) Chaya Liebeskind (LL), Yaakov Ha-Cohen Kerner (LL), Hevi Elyovich, Ruth Malka; (HI) Archna Bhatia (LL), Ashwini Vaidya (LL), Kanishka Jain, Vandana Puri, Shraddha Ratori, Vishakha Shukla, Shubham Srivastava; (HU) Veronika Vincze (LL), Katalin Simkó, Viktória Kovács; (TR) Tunga Güngör (LL), Gözde Berk, Berna Erden.
