Morphology David Yarowsky 9/8/2020 Acknowledgements and thanks to: - - PowerPoint PPT Presentation



slide-1
SLIDE 1

Morphology

David Yarowsky 9/8/2020

slide-2
SLIDE 2

Acknowledgements and thanks to:

  • Chris Quirk
  • Marta Costa-jussa
  • Richard DeArmond
  • Eleni Miltsakaki
  • Antske Fokkens
  • Penny Eckert
  • Ivan Sag
  • Adam Szczegielniak
  • Jeff Conn
  • Dan Jurafsky
  • Jason Eisner
slide-3
SLIDE 3

(and factored language models)

slide-4
SLIDE 4
slide-5
SLIDE 5

Morphological Analysis

morphemes or semantic features

dogs => dog+s or dog+PL
walking => walk+ing or walk+PRS;PTCP
running => runn+ing? run+ing & n->nn (gemination)
dancing => danc+ing? dance+ing & e->NULL (elision)
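The analysis examples on this slide can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the tiny `LEXICON` and the function name `analyze_ing` are hypothetical, and a real analyzer would use a full lemma list and many more spelling rules.

```python
# Hypothetical toy lexicon of known lemmas (an assumption for this sketch).
LEXICON = {"dog", "walk", "run", "dance"}

def analyze_ing(word, lexicon=LEXICON):
    """Return (lemma, rule) candidates for a word ending in -ing."""
    if not word.endswith("ing"):
        return []
    stem = word[:-3]
    candidates = []
    # Plain concatenation: walking -> walk+ing
    if stem in lexicon:
        candidates.append((stem, "+ing"))
    # Undo gemination: running -> runn -> run
    if len(stem) >= 2 and stem[-1] == stem[-2] and stem[:-1] in lexicon:
        candidates.append((stem[:-1], "+ing & gemination"))
    # Undo elision: dancing -> danc -> dance
    if stem + "e" in lexicon:
        candidates.append((stem + "e", "+ing & e->NULL"))
    return candidates

print(analyze_ing("walking"))
print(analyze_ing("running"))
print(analyze_ing("dancing"))
```

Checking each candidate lemma against the lexicon is what rules out the naive segmentations runn+ing and danc+ing.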

slide-6
SLIDE 6

Morphological Generation

morphemes or semantic features

dog+s => dogs or dog+PL => dogs
walk+ing => walking or walk+PRS;PTCP => walking
run+ing => running or run+PRS;PTCP => running & n->nn (gemination)
dance+ing => dancing or dance+PRS;PTCP => dancing & e->NULL (elision)
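A minimal generation sketch for the four examples above, assuming two simplified English spelling rules. The function names `add_ing` and `generate` are hypothetical. The gemination test is a rough consonant-vowel-consonant heuristic that ignores stress, so it will over-apply to words like "visit".

```python
VOWELS = set("aeiou")

def add_ing(lemma):
    """Attach -ing, applying simplified elision and gemination rules."""
    # Elision: drop a final silent e (dance -> dancing), but keep ee/ye/oe.
    if len(lemma) > 2 and lemma.endswith("e") and not lemma.endswith(("ee", "ye", "oe")):
        return lemma[:-1] + "ing"
    # Gemination: double the final consonant after consonant+vowel (run -> running).
    if (len(lemma) >= 3
            and lemma[-1] not in VOWELS and lemma[-1] not in "wxy"
            and lemma[-2] in VOWELS
            and lemma[-3] not in VOWELS):
        return lemma + lemma[-1] + "ing"
    return lemma + "ing"

def generate(lemma, features):
    """Map lemma + feature bundle to a surface form (only two bundles handled)."""
    if features == "PL":
        return lemma + "s"
    if features == "PRS;PTCP":
        return add_ing(lemma)
    raise ValueError(f"unsupported features: {features}")

for lemma, feats in [("dog", "PL"), ("walk", "PRS;PTCP"),
                     ("run", "PRS;PTCP"), ("dance", "PRS;PTCP")]:
    print(lemma, feats, "=>", generate(lemma, feats))
```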

slide-7
SLIDE 7

Inflectional Morphology

morphemes or semantic features

dogs => dog+s or dog+PL
walking => walk+ing or walk+PRS;PTCP

inflectional paradigm:

VERB    +PRS;3SG    +PRS;PTCP    +PST;PFV    +PST;PTCP
        (+s)        (+ing)       (+ed)       (+en/+ed)
walk    walks       walking      walked      walked
eat     eats        eating       ate         eaten

<= canonical affixes
<= regular grammatical feature extension of same core word meaning
(“I am walking” and “I walked” differ only by tense)

employer => employ+er or employ+V:N(AGT)
employment => employ+ment or employ+V:N(RESACT)
unemployable => un+[employ+able] (not able to be employed)
                [un+employ]+able (able to be not employed)
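One way to picture the walk/eat paradigm is as a mapping from feature bundles to forms, with canonical affixes as the default and a small table of irregular exceptions. The sketch below assumes plain concatenation for regular forms and covers only the two verbs on the slide; the names `CANONICAL_AFFIX`, `IRREGULAR`, `inflect`, and `paradigm` are hypothetical.

```python
# Canonical affix per feature bundle, as listed in the paradigm header.
CANONICAL_AFFIX = {
    "PRS;3SG": "s",
    "PRS;PTCP": "ing",
    "PST;PFV": "ed",
    "PST;PTCP": "ed",
}

# Irregular cells override the canonical affix (only "eat" is covered here).
IRREGULAR = {
    ("eat", "PST;PFV"): "ate",
    ("eat", "PST;PTCP"): "eaten",
}

def inflect(lemma, features):
    """Return the inflected form for one paradigm cell."""
    if (lemma, features) in IRREGULAR:
        return IRREGULAR[(lemma, features)]
    return lemma + CANONICAL_AFFIX[features]

def paradigm(lemma):
    """Return the full paradigm as a feature -> form dictionary."""
    return {features: inflect(lemma, features) for features in CANONICAL_AFFIX}

print(paradigm("walk"))
print(paradigm("eat"))
```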

slide-8
SLIDE 8

Inflectional Morphology

morphemes or semantic features

dogs => dog+s or dog+PL
walking => walk+ing or walk+PRS;PTCP

inflectional paradigm:

VERB    +PRS;3SG    +PRS;PTCP    +PST;PFV    +PST;PTCP
        (+s)        (+ing)       (+ed)       (+en/+ed)
walk    walks       walking      walked      walked
eat     eats        eating       ate         eaten

<= canonical affixes
<= regular grammatical feature extension of same core word meaning
(“I am walking” and “I walked” differ only by tense)

Derivational Morphology (new concept formation)

employer => employ+er or employ+V:N(Agent)
employment => employ+ment or employ+V:N(Result/ActOf)
unemployable => un+[employ+able] (not able to be employed)
                [un+employ]+able (able to be not employed?) <= is “to unemploy” a verb?

“employ” = An ACTION (verb)
“employer” = A PERSON (noun)
a “dogfight” is not a “dog”

slide-9
SLIDE 9

Morphological Segmentation

slide-10
SLIDE 10

Morphological Parse

slide-11
SLIDE 11
slide-12
SLIDE 12
slide-13
SLIDE 13
slide-14
SLIDE 14
slide-15
SLIDE 15
slide-16
SLIDE 16
slide-17
SLIDE 17
slide-18
SLIDE 18
slide-19
SLIDE 19
slide-20
SLIDE 20
slide-21
SLIDE 21
slide-22
SLIDE 22
slide-23
SLIDE 23
slide-24
SLIDE 24
slide-25
SLIDE 25
slide-26
SLIDE 26

<= Infixation of “freakin” morpheme

slide-27
SLIDE 27
slide-28
SLIDE 28
slide-29
SLIDE 29
slide-30
SLIDE 30
slide-31
SLIDE 31
slide-32
SLIDE 32
slide-33
SLIDE 33
slide-34
SLIDE 34
slide-35
SLIDE 35

Morphology for Machine Translation

slide-36
SLIDE 36
slide-37
SLIDE 37
slide-38
SLIDE 38
slide-39
SLIDE 39
slide-40
SLIDE 40
slide-41
SLIDE 41
slide-42
SLIDE 42
slide-43
SLIDE 43
slide-44
SLIDE 44
slide-45
SLIDE 45
slide-46
SLIDE 46
slide-47
SLIDE 47
slide-48
SLIDE 48
slide-49
SLIDE 49
slide-50
SLIDE 50
slide-51
SLIDE 51
slide-52
SLIDE 52
slide-53
SLIDE 53

Morphology at JHU

slide-54
SLIDE 54

Collaborators:

Faculty: David Yarowsky, Philipp Koehn, Matt Post, Kevin Duh, Jason Eisner

Senior Researchers/Postdocs: Christo Kirov, Garrett Nicolai, Oliver Adams, John Sylak-Glassman

PhD Students: Winston Wu, Arya McCarthy, Ryan Cotterell, Aaron Mueller, Huda Khayrallah, Patrick Xia

Masters/Undergraduates: Nidhi Vyas, John Hewitt, Roger Que, James Scharf, Dylan Lewis, Lawrence Wolf-Sarkin, ++

slide-55
SLIDE 55

Multi-Source/Multi-Stage Morphology Learning:

➢ Currently available supervised data (e.g. Wiktionary)
➢ Elicited paradigms (professional translators, MTurk)
➢ Seed data from grammars, ITG, linguistic universals
➢ Bilingual projection (e.g. from aligned Bibles)
➢ Monolingual contextual/distributional statistics

Multi-Source Offline Machine Learning

Complete Learned Paradigms
Human Vetting/Improvement

Run-time Executables and importable hash tables

>> DO THIS FOR 300-1600 WORLD LANGUAGES!

slide-56
SLIDE 56

GitHub distribution of Trained Morphological Analyzers AND generators for 903+ languages!

(will soon be 1100+)

Handling of finer granularity distinctions:

Analysis:
hablaron -> hablar,PST,[3,PL] (perfective)
hablaban -> hablar,PST,[3,PL] (imperfective)

Generation:
hablar,PST,3,PL -> hablaron, hablaban, … (including probabilities)
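The one-to-many relation between coarse feature bundles and surface forms can be sketched by inverting an analysis table. This is an illustrative toy, not the released analyzer: the table holds only the two forms from the slide, the names `ANALYSES`, `GENERATION`, `analyze`, and `generate` are hypothetical, and the probabilities the slide mentions are left out because no values are given.

```python
from collections import defaultdict

# form -> (lemma, tense, person, number, aspect); aspect is the finer distinction.
ANALYSES = {
    "hablaron": ("hablar", "PST", "3", "PL", "PFV"),
    "hablaban": ("hablar", "PST", "3", "PL", "IPFV"),
}

def analyze(form):
    """Return the coarse analysis plus the finer-grained aspect label."""
    lemma, tense, person, number, aspect = ANALYSES[form]
    return (lemma, tense, person, number), aspect

# Invert the table: coarse analysis -> all surface forms that realize it.
GENERATION = defaultdict(list)
for form, (*coarse, aspect) in ANALYSES.items():
    GENERATION[tuple(coarse)].append(form)

def generate(lemma, tense, person, number):
    """Return every known form for a coarse bundle (no probabilities here)."""
    return sorted(GENERATION.get((lemma, tense, person, number), []))

print(analyze("hablaron"))
print(generate("hablar", "PST", "3", "PL"))
```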

slide-57
SLIDE 57

UniMorph Feature Schema (dimensions of meaning)

slide-58
SLIDE 58

Example UniMorph uses in Information Extraction:

slide-59
SLIDE 59

Projection of POS tags and Dependency Parses (English semantic roles identify target cases; nsubj dependencies give Person/Number)

slide-60
SLIDE 60

Example UniMorph Output: Tables of English phrasal translations of inflected forms

SpInf       SpRoot  UniMorph Vector       English Template      English phrasal inflection
comía       comer   V;IPFV;PST;1;SG       I was VBG             I was eating
comías      comer   V;IPFV;PST;2;SG;INFM  you were VBG          you were eating
comías      comer   V;IPFV;PST;2;SG;FORM  you were VBG          you were eating
comía       comer   V;IPFV;PST;3;SG       he/she/it was VBG     he/she/it was eating
comíamos    comer   V;IPFV;PST;1;PL       we were VBG           we were eating
comíais     comer   V;IPFV;PST;2;PL;INFM  you all were VBG      you all were eating
comíais     comer   V;IPFV;PST;2;PL       you all were VBG      you all were eating
comían      comer   V;IPFV;PST;3;PL       they were VBG         they were eating
hablaba     hablar  V;IPFV;PST;1;SG       I was VBG             I was speaking
hablabas    hablar  V;IPFV;PST;2;SG;INFM  you were VBG          you were speaking
hablabas    hablar  V;IPFV;PST;2;SG;FORM  you were VBG          you were speaking
hablaba     hablar  V;IPFV;PST;3;SG       he/she/it was VBG     he/she/it was speaking
hablábamos  hablar  V;IPFV;PST;1;PL       we were VBG           we were speaking
hablabais   hablar  V;IPFV;PST;2;PL;INFM  you all were VBG      you all were speaking
hablabais   hablar  V;IPFV;PST;2;PL       you all were VBG      you all were speaking
hablaban    hablar  V;IPFV;PST;3;PL       they were VBG         they were speaking
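The English Template column suggests a simple fill-in procedure: pick a template from person and number, then substitute the English present participle for VBG. The sketch below assumes exactly that and handles only V;IPFV;PST bundles. The `VBG` lookup and the function name `phrasal_translation` are hypothetical stand-ins for whatever lexicon the real system uses.

```python
# (person, number) -> template, copied from the "English Template" column.
TEMPLATES = {
    ("1", "SG"): "I was VBG",
    ("2", "SG"): "you were VBG",
    ("3", "SG"): "he/she/it was VBG",
    ("1", "PL"): "we were VBG",
    ("2", "PL"): "you all were VBG",
    ("3", "PL"): "they were VBG",
}

# Hypothetical lookup: Spanish lemma -> English present participle.
VBG = {"comer": "eating", "hablar": "speaking"}

def phrasal_translation(lemma, unimorph):
    """Build the English phrasal inflection for a V;IPFV;PST feature vector."""
    feats = unimorph.split(";")
    if not {"V", "IPFV", "PST"} <= set(feats):
        raise ValueError("only V;IPFV;PST vectors are handled in this sketch")
    person = next(f for f in feats if f in {"1", "2", "3"})
    number = next(f for f in feats if f in {"SG", "PL"})
    return TEMPLATES[(person, number)].replace("VBG", VBG[lemma])

print(phrasal_translation("comer", "V;IPFV;PST;1;SG"))
print(phrasal_translation("hablar", "V;IPFV;PST;2;PL;INFM"))
```

As in the table, the INFM and FORM variants receive the same English phrase, since English does not mark that distinction.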

INPUT OUTPUT

slide-61
SLIDE 61

GitHub distribution of Trained Morphological Analyzers AND generators for 903+ languages!

(will soon be 1100+)

Diverse detailed inflectional morphology

Nouns: sg/pl and case (nom/acc/dat/gen/loc/other)
Verbs: tense (pst/prs/fut) + person/number (1SG, 1PL, 2..)
Adjectives: person/number/case/gender in progress

Analysis mode:

python analyze.py -i Inflected-Zapotec.txt -a Zapotec.analyses -l zap -d Zapotec-lemma-list

Generation mode:

python analyze.py -i Zapotec-lemma-list -g -a Zapotec.generation -l zap -d Zapotec-corpus-words

slide-62
SLIDE 62

UniMorph (example of currently released languages)

slide-63
SLIDE 63

UniMorph Languages (continued)

slide-64
SLIDE 64

UniMorph Languages (continued – page #3)

slide-65
SLIDE 65


slide-66
SLIDE 66

UniMorph Gloss Use for Machine Translation

slide-67
SLIDE 67

Derivational Morphology

slide-68
SLIDE 68

Derivational Morphology – Universalized Semantics

J:J(ATT): -ish
J:J(DIM): -ito
J:J(NEG): in-, un-
J:N(STATEQUALOF): -acity, -ance, -ancy, -cy, -dom, -ence, -ency, -ern, -ity, -ness, -ocity, -sion, -th, -ty
J:R(INMANNER): -ily, -ly
J:V(CAUSETOBE): -ate, -en, -ify, -ize
N:J(CHARBY): -some
N:J(FULLOF): -ful, -ious, -ous
N:J(HAVING): -ate, -uous
N:J(LIKEA): -esque, -ish, -like, -oid, -ous
N:J(MADEOF): -y
N:J(QUALOF): -y
N:J(RELATEDTO): -ar, -al, -ual, -an, -ary, -ery, -ry, -ese, -etic, -atic, -ial, -ian, -ic, -ical, -ular
N:J(WITHOUT): -less
N:R(RELATEDTO): -ally
N:N(AUG-LARGE): mega-
N:N(AUG-SUPERIOR): over-, super-
N:N(DIM-INFERIOR): -ling
N:N(DIM-SMALL): -ette, -ie, -let, -et, -y
N:N(DOEROF): -ist
N:N(FEM): -ess, -ling
N:N(SMALLINSTANCEOF): -let, -et
N:N(MATERIAL): -ing
N:N(REALMOF): -dom
N:N(ORIGIN): -ite
N:N(QUALITYOF): -ism
N:N(STATEQUALOF): -dom, -hood, -ship
N:N(WORKER-WITH): -man, -boy, -ier, -eer, -arian
N:N(RELATEDTO): -ory
N:R(INDIRECTIONOF): -ward, -wise
N:V(CAUSETOHAVE): -ate, -en, -fy
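A table like this can drive a simple suffix lookup that proposes (base, tag) candidates for a derived word. The sketch below copies only a handful of rows and strips the suffix with no spelling repair. The names `SUFFIX_TAGS` and `candidate_derivations` are hypothetical, and nothing here claims to be how the UniMorph tools apply the schema.

```python
# A few (tag, suffix) rows copied from the list above.
SUFFIX_TAGS = [
    ("J:N(STATEQUALOF)", "ness"),
    ("J:R(INMANNER)", "ly"),
    ("N:J(WITHOUT)", "less"),
    ("N:J(FULLOF)", "ful"),
    ("N:J(LIKEA)", "ish"),
    ("J:J(ATT)", "ish"),
    ("N:N(STATEQUALOF)", "hood"),
]

def candidate_derivations(word):
    """Return (base, tag) for every listed suffix that ends the word."""
    candidates = []
    for tag, suffix in SUFFIX_TAGS:
        if word.endswith(suffix) and len(word) > len(suffix):
            candidates.append((word[:-len(suffix)], tag))
    return candidates

print(candidate_derivations("kindness"))
print(candidate_derivations("hopeless"))
print(candidate_derivations("childish"))
```

The same suffix can carry more than one tag (here -ish), so the lookup returns every candidate and leaves disambiguation to the part of speech of the base.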

slide-69
SLIDE 69

Paradigms for Derivational Morphology

English:

Concept      Lemma(V)     V:N(AGT)      V:N(PAT)        V:N(RES;ACTOF)   V:J(ABIL)
EMPLOY       employ       employer      employee        employment       employable
GIVE         give         giver         recipient       gift; giving     givable
TRANSPORT    transport    transporter   transportee     transportation   transportable
INVESTIGATE  investigate  investigator  investigated/N  investigation    investigable

Spanish:

Concept      Lemma(V)     V:N(AGT)       V:N(PAT)      V:N(RES;ACTOF)  V:J(ABIL)
EMPLOY       emplear      empleador      empleado      empleo          empleable
GIVE         dar          dador          receptor      don;dar;regalo  dable
TRANSPORT    transportar  transportista  transportado  transporte      transportable
INVESTIGATE  investigar   investigador   investigado   investigación   investigable

Russian:

Concept      Lemma(V)          V:N(AGT)       V:N(PAT)          V:N(RES;ACTOF)  V:J(ABIL)
EMPLOY       нанимать          наниматель     работник          работа          трудоспособный
GIVE         давать            даритель       данный            дарение         доступный
TRANSPORT    транспортировать  транспортер    транспортируемый  транспорт       транспортабельный
INVESTIGATE  исследовать       исследователь  исследуемый       исследование    …

slide-70
SLIDE 70

Derivational Morphology

Learning Process:

           V         V:N(AGT)
ENGLISH:   employ    employer
SPANISH:   emplear   empleador

Via dictionary
ar:ador (or learned from other pairs)
0:er (learned or in database)

MT GOAL
Analysis Goal
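Learning a suffix rule "from other pairs" can be sketched by stripping the longest common prefix of a base/derived pair and keeping the remaining suffix rewrite. This is a minimal illustration, not the learning procedure used in the work described; the function names `learn_suffix_rule` and `apply_rule` are hypothetical. Note that this minimal version learns r:dor from emplear/empleador, which is the slide's ar:ador with less left context.

```python
def learn_suffix_rule(base, derived):
    """Return (old_suffix, new_suffix) after removing the longest common prefix."""
    i = 0
    while i < min(len(base), len(derived)) and base[i] == derived[i]:
        i += 1
    return base[i:], derived[i:]

def apply_rule(rule, base):
    """Rewrite the end of a new base form with a learned suffix rule."""
    old, new = rule
    if not base.endswith(old):
        raise ValueError(f"rule {old}:{new} does not apply to {base}")
    return base[:len(base) - len(old)] + new

english_rule = learn_suffix_rule("employ", "employer")     # empty suffix -> "er", the slide's 0:er
spanish_rule = learn_suffix_rule("emplear", "empleador")   # "r" -> "dor"

print(english_rule, apply_rule(english_rule, "transport"))
print(spanish_rule, apply_rule(spanish_rule, "investigar"))
```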

slide-71
SLIDE 71

Questions?