SLIDE 1

Outline: MWEs, Annotation, Lexical encoding, Variability measure, Conclusions, Project, Bibliography

Variability and fixedness of multiword expressions: multilingual annotation, lexical encoding and variability measures

Agata Savary

Université de Tours, France

LIMSI Seminar, 27 March 2018, Orsay

SLIDE 2

Multiword expressions

Word combinations that exhibit lexical, syntactic, semantic, pragmatic and/or statistical irregularities.

  • Examples: all of a sudden, a hot dog, to pay a visit, to pull one’s leg
  • Encompass heterogeneous objects: idioms, compounds, light verb constructions, rhetorical figures, institutionalised phrases or named entities
  • Pervasive feature: non-compositional semantics. The meaning of an MWE cannot be deduced from the meanings of its components and from its syntactic structure in a way deemed regular for the given language.
  • Varying degree of syntactic variability (flexibility), especially in verbal MWEs (VMWEs).

SLIDE 3

Morpho-syntactic variability of VMWEs

N tourne la page ‘N turns the page’ ⇒ ‘N stops dealing with sth.’

(More) regular properties
  • Free subject: Jean/il/elle tourne la page ‘Jean turns the page’ ⇒ ‘Jean stops dealing with sth.’
  • Verb inflection: Jean tournera la page ‘Jean will turn the page’ ⇒ ‘Jean will stop dealing with sth.’
  • Noun modification: Jean a tourné la page de la politique ‘Jean turned the page of politics’ ⇒ ‘Jean stopped dealing with politics’
  • Passive: La page a été tournée ‘the page was turned’ ⇒ ‘Someone stopped dealing with sth.’
  • Determiner alternation: ?Jean tourne la/cette/une page ‘Jean turns the/this/a page’ ⇒ ‘Jean stops dealing with sth.’
  • . . .

(More) idiosyncratic properties
  • Lexicalized verb and object: #Jean pivote la feuille ‘Jean rotates the sheet’
  • No verb reduction: *La page que Jean tourne est une page ‘The page that Jean turns is a page’
  • No noun inflection: ?Jean a tourné plusieurs pages ‘Jean turned several pages’
  • . . .

SLIDE 4

Scale-wise morpho-syntactic variability of VMWEs

[Table: one row per expression, one column per property; the cell values were lost, apart from two stray ‘?’ marks]

Columns: N0 V (Det N)1 expression, Free subject, Free verb, Free object, Verb reduction, Verb inflection, Noun inflection, Noun modification, Passive, Det. alternation

Rows:
  • N prend la pomme ‘N takes the apple’
  • N prend une décision ‘N takes a decision’ ⇒ ‘N makes a decision’
  • N tourne la page ‘N turns the page’ ⇒ ‘N stops dealing with sth.’
  • N prend la porte ‘N takes the door’ ⇒ ‘N leaves (because forced)’

SLIDE 5

MWE variability/regularity – state of the art

  • lexical encoding of VMWE variability profile [Gross(1986), Mel’čuk et al.(1988)]
  • theory-neutral lexical encoding of VMWE irregularity [Grégoire(2010), Przepiórkowski et al.(2014), McShane et al.(2015)]
  • MWE flexibility as a matter of scale [Gross(1988)]
  • MWE flexibility as a result of decomposability [Nunberg et al.(1994), Sheinfux et al.(2017)]
  • VMWE encoding in computational grammars:
      HPSG [Sag et al.(2002), Copestake et al.(2002), Villavicencio et al.(2004), Bond et al.(2015), Herzig Sheinfux et al.(2015)],
      LFG [Attia(2006)],
      TAG (MWE-friendly formalism) [Abeillé and Schabes(1989), Abeillé and Schabes(1996), Vaidya et al.(2014), Lichte and Kallmeyer(2016)]
  • MWE variant conflation in NLP [Jacquemin(2001), Krstev et al.(2014)]
  • variability as a major challenge in NLP [Savary and Jacquemin(2003), Hachey et al.(2013), Constant et al.(2017)]
  • restricted variability as a hint in MWE identification [Fazly et al.(2009), Tsvetkov and Wintner(2014), Buljan and Šnajder(2017)]

SLIDE 6

MWE variability/regularity - challenges

  • annotation: few treebanks with full-fledged VMWE annotation; heterogeneous annotation practices [Rosén et al.(2015), Savary et al.(2018)]
  • lexical encoding: account for the irregularity of an MWE while avoiding redundancy; share VMWE lexicons
  • processing: measure VMWE variability [Fazly et al.(2009), Pasquer et al.(2018)]; conflate VMWE variants [Pasquer et al.(2018)]

SLIDE 7

VMWEs annotation in the PARSEME scientific network

Methodology
  • 22 languages (6 non-Indo-European),
  • Unified annotation guidelines as decision trees driven by linguistic tests (with examples in many languages),
  • Universal categories, room for language-specific categories and tests,
  • Close links with Universal Dependencies treebanking.

SLIDE 8

VMWE variability in the PARSEME guidelines

Prototypical form: the head verb is in active voice and finite form; other lexicalized components depend either on the verb or on another lexicalized component.
  • elle prend une décision ‘she makes a decision’

Meaning-preserving variants:
  • analytical tenses: elle a pris une décision
  • relative clauses: la décision qu’elle prend
  • non-finite clauses: la décision prise, en prenant une décision
  • diathesis alternation: la décision sera prise
  • interposed modifiers: prendre une série de décisions

Canonical form: the prototypical form, or the most neutral form that keeps the idiomatic reading
  • elle prend une décision
  • les carottes sont cuites ‘the carrots are cooked’ ⇒ ‘it is too late’

Variant neutralization during annotation:
  • Texts contain VMWE variants,
  • Linguistic tests apply to the canonical form.

SLIDE 9

VMWE typology (v. 1.1)

Universal categories (valid for all languages):
  • light verb constructions (LVCs)
      LVC.full: to give a lecture
      LVC.cause: to grant rights
  • verbal idioms (VIDs): to call it a day

Quasi-universal categories (valid for many languages):
  • inherently reflexive verbs (IRVs): (FR) s’évanouir ‘to faint’
  • verb-particle constructions (VPCs)
      VPC.full: to do in ‘to kill’
      VPC.semi: to eat up ‘to eat completely’
  • multi-verb constructions (MVCs): to let go

Experimental (optional) category:
  • inherently adpositional verbs (IAVs): to come across sth/sb, to rely on sth/sb
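The typology above can be held in a small lookup table, which is handy when checking annotation labels. The following is only a minimal sketch: the labels and examples are copied from the slide, while the grouping keys and the helper name `group_of` are illustrative assumptions, not part of the PARSEME tooling.

```python
# Labels and examples are taken from the slide; the dictionary layout and
# the helper name `group_of` are hypothetical, chosen for illustration.
VMWE_TYPOLOGY = {
    "universal": {
        "LVC.full": "to give a lecture",
        "LVC.cause": "to grant rights",
        "VID": "to call it a day",
    },
    "quasi-universal": {
        "IRV": "s'évanouir",
        "VPC.full": "to do in",
        "VPC.semi": "to eat up",
        "MVC": "to let go",
    },
    "experimental": {
        "IAV": "to come across sth/sb",
    },
}


def group_of(label: str) -> str:
    """Return the typology group that a VMWE label belongs to."""
    for group, labels in VMWE_TYPOLOGY.items():
        if label in labels:
            return group
    raise KeyError(f"unknown VMWE label: {label}")


print(group_of("VPC.semi"))
```

A flat dictionary keeps the category names as plain strings, so they can be compared directly with labels read from an annotated file.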

SLIDE 10

PARSEME VMWE corpus and shared task (v. 1.0)

Corpus 1.0 [Savary et al.(2018)]

Corpus     Sentences   Tokens      VMWEs    Licence
Training   230,062     4,536,603   52,724   CC v4
Testing     44,314       902,601    9,494

Shared task 1.0 [Savary et al.(2017)] & 1.1
  • 7 systems, all languages covered,
  • Evaluation measures including VMWE variability.

SLIDE 11

XMG [Crabbé et al.(2013), Petitjean et al.(2016)]

a language
  • object-oriented: objects, classes, inheritance
  • declarative: grammaticality is defined in terms of constraints rather than procedures
  • notationally expressive: modularity, inheritance, conjunction/disjunction of tree fragments, namespaces
  • extensible to new dimensions (semantics, frames etc.), formalisms (IG, etc.), linguistic principles (e.g. clitic ordering)

a metagrammar compiler (for each target language, here FS-LTAG)
  • constraint solver: produces minimal tree models respecting the constraints

SLIDE 12

FrenchTAG – French XMG metagrammar [Crabbé et al.(2013)]

  • XMG implementation of the syntactic TAG grammar of French by [Abeillé(2002)]
  • 285 XMG classes, 96 families (classes assigned to lexemes), compiled into 9045 TAG trees
  • toy lexicon of 555 lexemes, including 248 verbs

Example
  • Jean prend la porte ‘John takes the door’ ⇒ ‘John leaves because he is forced to’
  • XMG covers literal readings (by compositionality)
  • XMG does not cover idiomatic readings

SLIDE 13

Morphology (simplified)

SLIDE 14

Lemmas

Trivial classes

  • propename → N⋄
  • noun → N⋄
  • CliticT → CL⋄
  • stddeterminer → N D⋄ N∗

SLIDE 15

From metagrammar to parsing: n0Vn1 (Jean prend la porte)

Metagrammar tree fragments inherited by n0Vn1
  • CanonicalSubject → S N↓ VN
  • activeVerbMorphology → S VN V⋄
  • CanonicalObject → S VN N↓

[Figures: grammar tree, derivation tree, derived tree]
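The way the three fragments combine into one grammar tree can be imitated with a toy merge. This is a sketch under strong assumptions: XMG itself solves dominance and precedence constraints, whereas the hypothetical `merge` below simply unifies unmarked nodes that share a label (here S and VN) and keeps marked leaves (↓ substitution sites, ⋄ anchors) distinct. The tuple encoding and all function names are invented for illustration.

```python
# A tree is (label, [children]). Labels ending in "↓" are substitution
# sites and labels ending in "⋄" are anchors; both are never unified.

def is_spine(node):
    """True for unmarked nodes, which this toy treats as unifiable."""
    return not node[0].endswith(("↓", "⋄"))


def merge(a, b):
    """Merge two fragments with the same root, keeping left-to-right order."""
    label_a, kids_a = a
    label_b, kids_b = b
    assert label_a == label_b, "fragments must share their root label"
    out = list(kids_a)
    pos = 0  # where the next unmatched child of b is inserted
    for kid in kids_b:
        idx = None
        if is_spine(kid):
            idx = next((i for i, k in enumerate(out)
                        if is_spine(k) and k[0] == kid[0]), None)
        if idx is None:
            out.insert(pos, kid)
            pos += 1
        else:
            out[idx] = merge(out[idx], kid)
            pos = idx + 1
    return (label_a, out)


def show(tree):
    """Render a tree in bracket notation."""
    label, kids = tree
    if not kids:
        return label
    return f"{label}({' '.join(show(k) for k in kids)})"


canonical_subject = ("S", [("N↓", []), ("VN", [])])
active_verb_morphology = ("S", [("VN", [("V⋄", [])])])
canonical_object = ("S", [("VN", []), ("N↓", [])])

n0Vn1 = merge(merge(canonical_subject, active_verb_morphology), canonical_object)
print(show(n0Vn1))  # S(N↓ VN(V⋄) N↓)
```

The result has one anchor (the verb) and two open substitution sites, which is why this tree only yields the literal reading of Jean prend la porte.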

SLIDE 16

From metagrammar to parsing: n0Vn1 (Jean la prend)

Metagrammar tree fragments inherited by n0Vn1
  • CanonicalSubject → S N↓ VN
  • activeVerbMorphology → S VN V⋄
  • CliticObject → S CL↓ VN
  • . . .

[Figures: grammar tree, derivation tree, derived tree]

SLIDE 17

XMG classes

SLIDE 18

Class hierarchy

[Figure: class hierarchy diagram. Legend: conjunction of classes, disjunction of classes.]

Classes shown: TopLevelClass, VerbalArgument, CanonicalArgument, SubjectAgreement, NonInvertedNominalSubject, Clitic, RealizedNonExtractedSubject, CanonicalNonSubjectArg, NonReflexiveClitic, VerbalMorphology, CliticSubject, CanonicalSubject, CanonicalObject, CliticObject3, ActiveVerbMorphology, Subject, Object, dian0Vn1Active, dian0Vn1Passive, dian0Vn1ShortPassive, n0Vn1, . . .

SLIDE 19

Adding MWEs to the metagrammar [Savary et al.(sub)]

Strategy
  • add lexical entries for MWEs with co-anchors and interface filters,
  • reuse existing tree fragments for the (more) regular properties, decorate them with interface features,
  • create new tree fragments for lexicalized arguments of various syntactic structures

SLIDE 20

MWE lemmas with co-anchors and filters

SLIDE 21

Previous tree fragments decorated with interface features

SLIDE 22

New XMG classes for lexicalized arguments

SLIDE 23

From metagrammar to parsing: mwen0Vn1 (Jean prend la porte)

Tree fragments inherited by mwen0Vn1
  • CanonicalSubject → S N↓ VN
  • activeVerbMorphology → S VN V⋄
  • mweCanonicalObjectLex → S VN N (no ↓)
  • mweDetNoun → N D⋄ N⋄

[Figures: grammar tree, derivation tree, derived tree]
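The effect of replacing the substitution site N↓ by a lexicalized N(D⋄ N⋄) can be seen by reading off the leaves of both trees. This is an illustrative sketch only: the tuple encoding, the function `frontier` and the dictionary-style lexical entries are assumptions made here, not the XMG lexicon format.

```python
# A tree is (label, [children]); "↓" marks a substitution site and "⋄" an
# anchor or co-anchor filled from the lexical entry. Encoding is hypothetical.

LITERAL_TREE = ("S", [("N↓", []), ("VN", [("V⋄", [])]), ("N↓", [])])
MWE_TREE = ("S", [("N↓", []),
                  ("VN", [("V⋄", [])]),
                  ("N", [("D⋄", []), ("N⋄", [])])])

# Hypothetical lexical entries: category of each (co-)anchor -> word form.
LITERAL_ENTRY = {"V": "prend"}
MWE_ENTRY = {"V": "prend", "D": "la", "N": "porte"}


def frontier(tree, entry):
    """Left-to-right leaves, with every anchor replaced by its word form."""
    label, kids = tree
    if kids:
        return [leaf for kid in kids for leaf in frontier(kid, entry)]
    if label.endswith("⋄"):
        return [entry[label[:-1]]]
    return [label]


print(frontier(LITERAL_TREE, LITERAL_ENTRY))  # ['N↓', 'prend', 'N↓']
print(frontier(MWE_TREE, MWE_ENTRY))          # ['N↓', 'prend', 'la', 'porte']
```

In the MWE tree only the subject remains open for substitution, while the determiner and the noun are fixed by the entry, which matches the "(no ↓)" remark on the slide.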

SLIDE 24

Modified class hierarchy

[Figure: modified class hierarchy diagram. Legend: conjunction of classes, disjunction of classes.]

Classes shown: TopLevelClass, VerbalArgument, CanonicalArgument, SubjectAgreement, NonInvertedNominalSubject, Clitic, RealizedNonExtractedSubject, CanonicalNonSubjectArg, NonReflexiveClitic, mweCanonicalObject, CliticSubject, CanonicalSubject, mweObjectLex, mweLexDetLexNoun, CanonicalObject, CliticObject3, VerbalMorphology, Subject, mweSubjectLexStruct, mweObjectLexStruct, Object, ActiveVerbMorphology, mweSubject, mweObject, mwedian0Vn1Active, mwedian0Vn1Passive, mwedian0Vn1ShortPassive, mwen0Vn1, . . .

SLIDE 25

Evaluation

Corpus
  • French PARSEME corpus
  • Selection of 14 VMWEs (frequent) and 52 occurrences (large syntactic variety).
  • Simplification: (i) no subordinate clauses or coordinations, (ii) few non-lexicalized arguments/modifiers.
  • TRAIN (26 occ.), TEST (26 occ.).

             FrenchTAG   mweFrenchTAG             mweFrenchTAG
                         (MWEs from DEV-S)        (MWEs from DEV-S + TEST-S)
classes      285         337 (+18%)               341 (+1.1%)
MWE lemmas   5           31                       37

Proof of concept
  • Non-redundant lexical encoding of MWEs can be effectively achieved in an object-oriented metagrammar-based approach

SLIDE 26

(Language-independent) syntactic and linear similarity

[Pasquer et al.(2018)]

Ils ne prennent vraiment pas une bonne décision
They neg take really.adv neg.adv a.det good.adj decision
[Dependency arcs: nsubj, adv, adv, adv, det, amod, obj, root]

Voici les sages décisions que Jean a aussitôt prises
Here the wise decisions that.pron John.propn has.aux at-once.adv taken
[Dependency arcs: det, amod, obj, obj, nsubj, aux, adv, acl:relcl, root]

Syntactic similarity of components
  • Sørensen–Dice coefficient: $S(O_1, O_2) = \frac{2 \times |P(O_1) \cap P(O_2)|}{|P(O_1)| + |P(O_2)|}$
  • On outgoing dependencies: $SS(\textit{décision}, \textit{décisions}) = \frac{2 \times |\{det, amod\}|}{|\{det, amod, acl{:}relcl\}| + |\{det, amod\}|} = \frac{4}{5}$

Similarity of VMWEs
  • Syntactic: weighted average of per-component similarity.
  • Linear: Sørensen–Dice coefficient on inserted POS: $SL(V_1, V_2) = \frac{2 \times |\{adv\}|}{|\{adv, det, adj\}| + |\{pron, propn, aux, adv\}|} = \frac{2}{7}$
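The two worked values on this slide can be recomputed with a few lines of code. The sketch below assumes that the properties P(O) are plain sets of dependency labels or POS tags, as in the slide's examples; the function name `dice` and the convention for two empty sets are assumptions, not taken from [Pasquer et al.(2018)].

```python
from fractions import Fraction


def dice(p1: set, p2: set) -> Fraction:
    """Sørensen–Dice coefficient 2|P1 ∩ P2| / (|P1| + |P2|) as an exact fraction."""
    if not p1 and not p2:
        # Assumption: two empty property sets count as identical.
        return Fraction(1)
    return Fraction(2 * len(p1 & p2), len(p1) + len(p2))


# Outgoing dependencies of "décision" and "décisions" in the two sentences.
ss = dice({"det", "amod"}, {"det", "amod", "acl:relcl"})

# POS tags inserted between the verb and the noun in each occurrence.
sl = dice({"adv", "det", "adj"}, {"pron", "propn", "aux", "adv"})

print(ss, sl)  # 4/5 2/7
```

Using `Fraction` keeps the results in the same form as on the slide and avoids floating-point rounding when values are compared.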

SLIDE 27

Syntactic and linear variability [Pasquer et al.(2018)]

VMWE variability
  • Rigidity of a VMWE: average of pairwise similarities of all variants
  • Variability of a VMWE: inverse of rigidity

Applications
  • Linear variability is positively correlated with a linguistic variability benchmark in French [Tutin(2016)]
  • Linear variability discriminates LVCs from VIDs.
  • Linear variability discriminates idiomatic from literal readings
  • Syntactic and linear similarity features are useful in VMWE variant identification [Pasquer et al.(sub)]
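Rigidity as an average of pairwise similarities can be sketched as follows. Several points are assumptions rather than the authors' definition: each variant is reduced to a set of inserted POS tags, similarity is the Sørensen–Dice coefficient, and "inverse of rigidity" is read as 1 − rigidity (a reciprocal would be another possible reading). The third variant in the example is made up.

```python
from fractions import Fraction
from itertools import combinations


def dice(p1: set, p2: set) -> Fraction:
    """Sørensen–Dice coefficient on two sets (two empty sets count as identical)."""
    if not p1 and not p2:
        return Fraction(1)
    return Fraction(2 * len(p1 & p2), len(p1) + len(p2))


def rigidity(variants: list) -> Fraction:
    """Average similarity over all unordered pairs of variants."""
    pairs = list(combinations(variants, 2))
    if not pairs:
        raise ValueError("at least two variants are needed")
    return sum(dice(a, b) for a, b in pairs) / len(pairs)


def variability(variants: list) -> Fraction:
    """Assumption: 'inverse of rigidity' is taken here as 1 - rigidity."""
    return 1 - rigidity(variants)


# First two sets come from the similarity example; the third is hypothetical.
variants = [
    {"adv", "det", "adj"},
    {"pron", "propn", "aux", "adv"},
    {"det"},
]
print(rigidity(variants), variability(variants))  # 11/42 31/42
```

With this reading, a VMWE whose variants all insert the same material gets rigidity 1 and variability 0.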

SLIDE 28

Conclusions

Tensions around MWEs
  • Variability vs. fixedness (grammatical vs. lexical encoding)
  • Formal-grammar integration vs. theory-independence
  • Denotational precision vs. multilingualism

Solutions
  • Object-oriented modeling: to express degrees of variability
  • Metagrammar: (relative) theory-independence + compilation into a precision grammar for one language
  • Iterative unified annotation methodology development in a multilingual network
  • Language-independent MWE variability measure

SLIDE 29

LIMSI integration project (MWE++)

ILES keywords | AS experience | Perspectives
  • building large corpora | MWEs (PARSEME, PARSEME-FR); NEs (National Corpus of Polish); coreference (CORE-PL); events (TEMPORAL, ODIL-FR, ISOTimeML) | PARSEME for all MWEs, UD-PARSEME, universalism
  • evaluation campaigns | PARSEME shared task (1.0, 1.1, . . . ) | PARSEME 2.0
  • multilingualism | PARSEME network, 10 resources, 18 languages | new COST or ETN
  • paraphrase | MWE variability | normalization and semantic modelling of MWEs
  • information extraction | automatic recognition: NEs (nested), MWEs, terms, emotions | recognition of multilingual MWE variants
  • sign language, multimodal corpora | first contacts within PARSEME | MWE annotation in sign languages
  • spoken language | emotion recognition (Emotirob), annotation of coreference & events (ODIL), ESLO corpus | emotion in MWEs
  • symbolic and statistical approaches | precision grammars, symbolic parsing and MWEs, supervised recognition of NEs and MWEs | learning MWE variability profiles
  • regional languages | MWEs in ES dialects | non-redundant encoding of dialectal variation of MWEs

SLIDE 30

Bibliography I

Abeillé, A. and Schabes, Y. (1989). Parsing idioms in lexicalized TAGs. In H. L. Somers and M. M. Wood, eds., Proceedings of the 4th Conference of the European Chapter of the ACL, EACL’89, Manchester, pp. 1–9. The Association for Computer Linguistics.

Abeillé, A. and Schabes, Y. (1996). Non-compositional discontinuous constituents in Tree Adjoining Grammar. In H. Bunt and A. van Horck, eds., Discontinuous Constituency, pp. 279–306. Mouton de Gruyter, Berlin, Germany.

Abeillé, A. (2002). Une grammaire électronique du français. CNRS Editions.

Attia, M. A. (2006). Accommodating multiword expressions in an Arabic LFG grammar. In Proceedings of the 5th international conference on Advances in Natural Language Processing, pp. 87–98, Berlin. Springer.

Bond, F., Ho, J. Q., and Flickinger, D. (2015). Feeling our way to an analysis of English possessed idioms. In S. Müller, ed., Proceedings of the 22nd International Conference on Head-Driven Phrase Structure Grammar, pp. 61–74, Stanford, CA. CSLI Publications.

Buljan, M. and Šnajder, J. (2017). Combining Linguistic Features for the Detection of Croatian Multiword Expressions. In Proceedings of the 13th Workshop on Multiword Expressions (MWE 2017), Valencia, Spain. Association for Computational Linguistics.

SLIDE 31

Bibliography II

Constant, M., Eryiğit, G., Monti, J., van der Plas, L., Ramisch, C., Rosner, M., and Todirascu, A. (2017). Multiword Expression Processing: A Survey. Computational Linguistics, 43(4), 837–892.

Copestake, A., Lambeau, F., Villavicencio, A., Bond, F., Baldwin, T., Sag, I. A., and Flickinger, D. (2002). Multiword expressions: linguistic precision and reusability. In Proceedings of LREC 2002.

Crabbé, B., Duchier, D., Gardent, C., Roux, J. L., and Parmentier, Y. (2013). XMG: extensible metagrammar. Computational Linguistics, 39(3), 591–629.

Fazly, A., Cook, P., and Stevenson, S. (2009). Unsupervised type and token identification of idiomatic expressions. Computational Linguistics, 35(1), 61–103.

Grégoire, N. (2010). DuELME: a Dutch electronic lexicon of multiword expressions. Language Resources and Evaluation, 44(1-2).

Gross, G. (1988). Degré de figement des noms composés. Langages, 90, 57–71. Paris: Larousse.

SLIDE 32

Bibliography III

Gross, M. (1986). Lexicon-grammar: The Representation of Compound Words. In Proceedings of the 11th Conference on Computational Linguistics, pp. 1–6, Stroudsburg, PA, USA. Association for Computational Linguistics.

Hachey, B., Radford, W., Nothman, J., Honnibal, M., and Curran, J. R. (2013). Evaluating Entity Linking with Wikipedia. Artif. Intell., 194, 130–150.

Herzig Sheinfux, L., Arad Greshler, T., Melnik, N., and Wintner, S. (2015). Hebrew verbal multi-word expressions. In S. Müller, ed., Proceedings of the 22nd International Conference on Head-Driven Phrase Structure Grammar, Nanyang Technological University (NTU), Singapore, pp. 122–135, Stanford, CA. CSLI Publications.

Jacquemin, C. (2001). Spotting and Discovering Terms through Natural Language Processing. MIT Press.

Krstev, C., Obradovic, I., Utvic, M., and Vitas, D. (2014). A system for named entity recognition based on local grammars. J. Log. Comput., 24(2), 473–489.

Lichte, T. and Kallmeyer, L. (2016). Same syntax, different semantics: A compositional approach to idiomaticity in multi-word expressions. In C. Piñón, ed., Empirical Issues in Syntax and Semantics 11, pp. 111–140.

SLIDE 33

Bibliography IV

McShane, M., Nirenburg, S., and Beale, S. (2015). The Ontological Semantic treatment of multiword expressions. Lingvisticæ Investigationes, 38(1), 73–110.

Mel’čuk, I., Arbatchewsky-Jumarie, N., Dagenais, L., Elnitsky, L., Iordanskaja, L., Lefebvre, M.-N., and Mantha, S. (1988). Dictionnaire explicatif et combinatoire du français contemporain: Recherches lexico-sémantiques. Presses de l’Univ. de Montréal.

Nunberg, G., Sag, I. A., and Wasow, T. (1994). Idioms. Language, 70, 491–538.

Pasquer, C., Savary, A., Antoine, J.-Y., and Ramisch, C. (2018). Towards a Variability Measure for Multiword Expressions. In Proceedings of NAACL. Accepted paper.

Pasquer, C., Savary, A., Antoine, J.-Y., and Ramisch, C. (sub.). If you’ve seen some, you’ve seen them all: Identifying variants of multiword expressions. In Proceedings of COLING 2018.

Petitjean, S., Duchier, D., and Parmentier, Y. (2016). XMG 2: Describing description languages. In M. Amblard, P. de Groote, S. Pogodalla, and C. Retoré, eds., Logical Aspects of Computational Linguistics. Celebrating 20 Years of LACL (1996-2016) - 9th International Conference, LACL 2016, Nancy, France, December 5-7, 2016, Proceedings, pp. 255–272.

SLIDE 34

Bibliography V

Przepiórkowski, A., Hajnicz, E., Patejuk, A., and Woliński, M. (2014). Extended phraseological information in a valence dictionary for NLP applications. In Proceedings of the Workshop on Lexical and Grammatical Resources for Language Processing (LG-LP 2014), pp. 83–91, Dublin, Ireland. Association for Computational Linguistics and Dublin City University.

Rosén, V., Losnegaard, G. S., De Smedt, K., Bejček, E., Savary, A., Przepiórkowski, A., Osenova, P., and Barbu Mititelu, V. (2015). A survey of multiword expressions in treebanks. In Proceedings of the 14th International Workshop on Treebanks & Linguistic Theories conference, Warsaw, Poland.

Sag, I., Baldwin, T., Bond, F., Copestake, A., and Flickinger, D. (2002). Multiword expressions: A pain in the neck for NLP. In Proceedings of the 3rd International Conference on Intelligent Text Processing and Computational Linguistics (CICLing-2002), pp. 1–15, Mexico City, Mexico.

Savary, A. and Jacquemin, C. (2003). Reducing Information Variation in Text. LNCS, 2705, 145–181.

Savary, A., Ramisch, C., Cordeiro, S., Sangati, F., Vincze, V., QasemiZadeh, B., Candito, M., Cap, F., Giouli, V., Stoyanova, I., and Doucet, A. (2017). The PARSEME Shared Task on Automatic Identification of Verbal Multiword Expressions. In Proceedings of the EACL’17 Workshop on Multiword Expressions.

SLIDE 35

Bibliography VI

Savary, A., Candito, M., Mititelu, V., Bejček, E., Cap, F., Čéplö, S., Cordeiro, S. R., Eryiğit, G., Giouli, V., van Gompel, M., HaCohen-Kerner, Y., Kovalevskaitė, J., Krek, S., Liebeskind, C., Monti, J., Parra Escartín, C., van der Plas, L., QasemiZadeh, B., Ramisch, C., Sangati, F., Stoyanova, I., and Vincze, V. (2018). Selected Extended Papers from the MWE 2017 Workshop, chapter The PARSEME multilingual corpus of verbal multiword expressions.

Savary, A., Petitjean, S., Lichte, T., Kallmeyer, L., and Waszczuk, J. (sub.). Object-oriented lexical encoding of multiword expressions: Short and sweet. In Proceedings of COLING 2018.

Sheinfux, L. H., Greshler, T. A., Melnik, N., and Wintner, S. (2017). Verbal MWEs: Idiomaticity and flexibility, pp. 5–38. Language Science Press, to appear.

Tsvetkov, Y. and Wintner, S. (2014). Identification of multiword expressions by combining multiple linguistic information sources. Computational Linguistics, 40(2), 449–468.

Tutin, A. (2016). Comparing morphological and syntactic variations of support verb constructions and verbal full phrasemes in French: a corpus based study. In PARSEME COST Action. Relieving the pain in the neck in natural language processing: 7th final general meeting, Dubrovnik, Croatia.

SLIDE 36

Bibliography VII

Vaidya, A., Rambow, O., and Palmer, M. (2014). Light verb constructions with ‘do’ and ‘be’ in Hindi: A TAG analysis. In Proceedings of the Workshop on Lexical and Grammatical Resources for Language Processing, pp. 127–136.

Villavicencio, A., Copestake, A., Waldron, B., and Lambeau, F. (2004). Lexical Encoding of MWEs. In ACL Workshop on Multiword Expressions: Integrating Processing, July 2004, pp. 80–87.

SLIDE 37

Insertion of adjuncts in LTAGs

SLIDE 38

MWE long-distance dependencies in LTAGs

[Figure: elementary tree S(NP↓ VP(V(pull) NP(PossD↓ N(socks)) Part(up))), with two feature structures [Pers 1, Num 2, Gen 3] whose person, number and gender values are co-indexed]