The Marburg Agreement Project: Corpus, annotation and preliminary results

Magnus Breder Birkenes, Stephanie Leser-Cronau (University of Marburg). Parallel text analysis in diachronic research, Marburg, 02/22/2018


slide-1
SLIDE 1

The Marburg Agreement Project

Corpus, annotation and preliminary results

Magnus Breder Birkenes, Stephanie Leser-Cronau (University of Marburg) “Parallel text analysis in diachronic research”, Marburg, 02/22/2018

slide-2
SLIDE 2

Background

  • “Diachrony of Agreement Systems: Breton – Welsh – German (and other Germanic languages)” (current research project at the University of Marburg)
  • funding: Deutsche Forschungsgemeinschaft (DFG)
  • principal investigators: Jürg Fleischer, Paul Widmer, Erich Poppe
  • research assistants: Magnus Breder Birkenes (programming, North Germanic), Stephanie Leser-Cronau (German / West Germanic), Kerstin Plein (Welsh), Ricarda Scherschel (Breton)
  • student assistants: Katja Daube, Canan Elif Sertkaya (German), Lara Geinitz, Julia Vogelsang (Brittonic)
  • method: annotation and comparison of parallel texts to keep text type and content constant, ensuring maximum comparability

slide-3
SLIDE 3

Terminology

(adapted from: Corbett 2006, p. 5)

slide-4
SLIDE 4

Two examples

(1) The committee has/have agreed (cf. Corbett 2006, p. 2)

(2) Das Mädchen legt seinen_n/ihren_f Mantel ab. Sie_f/es_n trägt ein rotes Kleid.
‘The girl takes her coat off. She is wearing a red dress.’ (Köpcke and Zubin 2009, p. 142)

  • prone to linguistic leveling?
slide-5
SLIDE 5

Motivation

  • agreement sensitive to genre and style, as shown by Corbett (2006, pp. 271–273) for Russian and Levin (2001) for English
  • Birkenes and Sommer (2015): collective nouns: plural agreement in the finite verb common in religious (translational) texts
  • Fleischer (2012) and Leser-Cronau (2017): semantic agreement in oral language, syntactic agreement becoming the norm in written German

  • language change or genre effects?
slide-6
SLIDE 6

Project languages

  • Germanic languages: entire family (with a special focus on German)
  • East Germanic: Gothic
  • West Germanic: High and Low German, Dutch, Afrikaans, West / East / North Frisian, English

  • North Germanic: Icelandic, Faroese, Norwegian, Swedish, Danish
  • Brittonic languages: Breton and Welsh
slide-7
SLIDE 7

Research questions

  • 1. Pervasiveness of agreement: How does agreement develop diachronically in the Germanic and Brittonic languages?

  • 2. The role of mismatches: How common are agreement mismatches?
slide-8
SLIDE 8

Structure of the talk

  • 1. Introduction: Corpus and technical infrastructure (Magnus Breder Birkenes)
  • 2. Preliminary results from the pervasiveness study (Magnus Breder Birkenes)
  • 3. Exploring the results (Paul Widmer)
  • 4. Case study I: Mismatches in the history of German (Stephanie Leser-Cronau)
  • 5. Case study II: Agreement-relevant Initial Consonant Mutations (ICMs) in Welsh (Kerstin Plein)

  • 6. Conclusions
  • 7. Discussion
slide-9
SLIDE 9

Corpus and annotation

slide-10
SLIDE 10

Corpus and annotation

Bible corpus

  • the Bible as a “massively parallel” text
  • parallel texts widely used in translation studies, computational linguistics and typological research (e.g. Cysouw and Wälchli 2007)
  • less used in diachronic studies. Prominent exceptions: “Pragmatic Resources in Old Indo-European Languages” (PROIEL, see Haug and Jøhndal 2008) and Biblia Medieval (http://www.bibliamedieval.es/)
  • biblical and religious texts well attested in the transmission of the project languages (in some better than in others)
  • pros: widely available, allows for (exact) comparison, prose text
  • cons: translation syntax, archaic structures, different translation methods, style

slide-11
SLIDE 11

Corpus and annotation

Bible corpus (Germanic/Brittonic): 34 texts

[Figure: timeline of the Bible texts in the corpus. Rows: Latin, Gothic, High German, Low German, West Frisian, East Frisian, North Frisian, Dutch, Afrikaans, English, Icelandic, Faroese, Norwegian, Danish, Welsh, Breton; x-axis: years 400–2000. Text labels in the figure: Vulgata, Wulfila, Tatian, Beheim, Luther 1545, Luther 2017, Bugenhagen, Jessen, Wumkes, Saterlandic, Sylt, Statenvertaling 1637, Statenvertaling 1977, Bible, Wycliffe, King James, Gammelnorsk homiliebok, Guðbrandsbiblia, Biblian, Reformationsbibelen, Bibelen (bokmål/nynorsk), Autoriseret, Salesbury, BWM, Llafar, Le Gonidec, Oliéro, ...]

slide-12
SLIDE 12

Corpus and annotation

Annotation

  • 1. pervasiveness study: defined text portion: “The Birth of Jesus”
  • larger portion (German, Welsh, Breton): Luke 1:5–2:35 (111 verses)
  • in total: 16 texts of ca. 2000 tokens each (reference: Luther 2017)
  • smaller portion (all other Germanic languages): Luke 2:1–2:20 (20 verses)
  • in total: 18 texts of ca. 300 tokens each (reference: Luther 2017)
  • 2. mismatch study: all relevant controllers
slide-13
SLIDE 13

Corpus and annotation

Data (as of February 2018)

  • German: 13,132 relations
  • Welsh: 5,048 relations
  • Breton: 4,007 relations
  • Other West Germanic languages: 1,658 relations
  • North Germanic languages: 1,724 relations

= 25,569 agreement relations
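
As a quick check of the arithmetic on this slide, the following sketch sums the per-group counts and derives each group's share of the total. The counts are taken from the slide; the dictionary layout and the helper name `share` are illustrative assumptions, not part of the project's tooling.

```python
# Relation counts per language group, as reported on the slide (February 2018).
relations = {
    "German": 13132,
    "Welsh": 5048,
    "Breton": 4007,
    "Other West Germanic": 1658,
    "North Germanic": 1724,
}

total = sum(relations.values())


def share(group: str) -> float:
    """Return the group's share of all annotated relations, in percent."""
    return 100 * relations[group] / total


if __name__ == "__main__":
    print(f"total: {total}")
    for group in relations:
        print(f"{group}: {share(group):.1f} %")
```

The shares make the imbalance of the sample visible: roughly half of all relations come from the German texts.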

slide-14
SLIDE 14

Corpus and annotation

Annotation

  • tagging of all potential agreement forms, including those that do not show any agreement (anymore):
  • verbs (finite verbs, participles), adjectives, determiners and pronouns
  • Standard Average European bias: e.g. no object agreement (not relevant for Germanic)

  • annotation:
  • controller, target: gender, number, person
  • domain: attributive, predicative, relative, anaphoric
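
The annotation scheme above (controller, target, three features, four domains) could be modelled roughly as follows. This is a minimal sketch under stated assumptions: the class names `Form` and `AgreementRelation`, the feature-value labels such as "pl", and the validation logic are hypothetical and do not describe the project's actual data model.

```python
from dataclasses import dataclass, field

# Domains and features as listed on the slide.
DOMAINS = {"attributive", "predicative", "relative", "anaphoric"}
FEATURES = {"gender", "number", "person"}


@dataclass
class Form:
    """A word form with the agreement features it expresses (hypothetical labels)."""
    token: str
    features: dict = field(default_factory=dict)  # e.g. {"number": "pl"}


@dataclass
class AgreementRelation:
    """One controller-target pair in a given agreement domain."""
    controller: Form
    target: Form
    domain: str

    def __post_init__(self):
        if self.domain not in DOMAINS:
            raise ValueError(f"unknown domain: {self.domain!r}")
        used = set(self.controller.features) | set(self.target.features)
        unknown = used - FEATURES
        if unknown:
            raise ValueError(f"unknown features: {sorted(unknown)}")


# Example: 'shepherds' controls plural marking on the finite verb 'were'.
example = AgreementRelation(
    controller=Form("shepherds", {"number": "pl"}),
    target=Form("were", {"number": "pl"}),
    domain="predicative",
)
```

A target that no longer shows agreement would simply carry an empty feature dictionary, which keeps potential targets in the data without claiming overt agreement.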
slide-15
SLIDE 15

Corpus and annotation

Example: Luke 2:8

ENG: And there were.pl_2 in the_1 same_1 country_1 shepherds_2 abiding_2 in the_3 field_3, keeping_2 watch over their.3pl_2,4 flock_4 by night.

(green = controller, blue = target, red = potential target, indices = controller id)
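
To make the index notation concrete, the sketch below groups the tokens of the example by controller id. It assumes a plain-text rendering in which the indices follow an underscore (`shepherds_2`, `their.3pl_2,4`); this rendering, the regular expression and the function name `group_by_controller` are assumptions made for illustration only.

```python
import re
from collections import defaultdict

# Luke 2:8 with controller ids written after an underscore (assumed notation).
EXAMPLE = (
    "And there were.pl_2 in the_1 same_1 country_1 shepherds_2 abiding_2 "
    "in the_3 field_3, keeping_2 watch over their.3pl_2,4 flock_4 by night."
)

# A form (optionally with a gloss after a dot), an underscore, then one or more ids.
TOKEN_RE = re.compile(r"^(?P<form>.+?)_(?P<ids>\d+(?:,\d+)*)$")


def group_by_controller(text: str) -> dict:
    """Map each controller id to the word forms that carry it, in text order."""
    groups = defaultdict(list)
    for raw in text.split():
        match = TOKEN_RE.match(raw.rstrip(".,;:"))
        if match is None:
            continue  # token is not part of any agreement relation
        word = match.group("form").split(".")[0]  # drop glosses such as '.pl'
        for controller_id in match.group("ids").split(","):
            groups[int(controller_id)].append(word)
    return dict(groups)


if __name__ == "__main__":
    for controller_id, words in sorted(group_by_controller(EXAMPLE).items()):
        print(controller_id, words)
```

Note that `their` appears under two ids, since it is a target of `shepherds` and at the same time belongs to the relation headed by `flock`.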

slide-16
SLIDE 16

Database and web application

slide-17
SLIDE 17

Database and web application

Solution

  • web application for annotating parallel texts
  • the parallel texts are aligned on verse level using an alignment id (cf. the system of Mayer and Cysouw 2014, p. 3161):

1 40001001 TAB The book of the generations of Jesus Chris ... LF
2 40001002 TAB The son of Abraham was Isaac ; and the so ... LF
3 40001003 TAB And the sons of Judah were Perez and Zerah ... LF
4 ...

  • texts and annotations stored in a SQL database
  • queries via the webapp
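
The verse-level alignment described above can be sketched as follows. The code assumes that each line consists of an eight-digit id, a tab and the verse text, and that the id encodes book (two digits), chapter (three digits) and verse (three digits), so that 40001001 is book 40, chapter 1, verse 1. The names `Verse`, `parse_line` and `align` are hypothetical, and the verse texts in the usage example are placeholders.

```python
from typing import NamedTuple


class Verse(NamedTuple):
    book: int
    chapter: int
    verse: int
    text: str


def parse_line(line: str) -> Verse:
    """Parse one 'ID<TAB>text' line (assumed layout: BBCCCVVV)."""
    verse_id, text = line.rstrip("\n").split("\t", 1)
    if len(verse_id) != 8 or not verse_id.isdigit():
        raise ValueError(f"unexpected alignment id: {verse_id!r}")
    return Verse(int(verse_id[:2]), int(verse_id[2:5]), int(verse_id[5:]), text)


def align(lines_a, lines_b):
    """Pair the verse texts of two versions by shared id; unmatched verses are skipped."""
    index_b = {(v.book, v.chapter, v.verse): v for v in map(parse_line, lines_b)}
    pairs = []
    for v in map(parse_line, lines_a):
        key = (v.book, v.chapter, v.verse)
        if key in index_b:
            pairs.append((v.text, index_b[key].text))
    return pairs


if __name__ == "__main__":
    version_a = ["40001001\t<verse text, version A>", "40001002\t<verse text, version A>"]
    version_b = ["40001002\t<verse text, version B>"]
    print(align(version_a, version_b))
```

Keying on the id rather than on line position means that versions with missing or reordered verses still align correctly.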
slide-18
SLIDE 18

Database and web application

Web application

  • built in Python/Flask, using PostgreSQL as database, programmed by me
  • hosted on a virtual server at the university computer center in Marburg, maintained by me
  • allows for stand-off annotation of parallelized texts using HTML and JavaScript

  • user-friendly: point-and-click annotation
  • repetitive tasks can be automated
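
Stand-off annotation means that the text itself is stored untouched and the annotations point to token ids. The sketch below illustrates the idea with Python's built-in `sqlite3` module as a stand-in for PostgreSQL. The table and column names, the verse id 42002008 for Luke 2:8 and the helper functions are all assumptions made for illustration; the project's actual schema is not shown in the slides.

```python
import sqlite3

# Hypothetical schema: tokens are stored once, relations only reference token ids.
SCHEMA = """
CREATE TABLE token (
    id        INTEGER PRIMARY KEY,
    text_name TEXT NOT NULL,
    verse_id  TEXT NOT NULL,
    position  INTEGER NOT NULL,
    form      TEXT NOT NULL
);
CREATE TABLE relation (
    id            INTEGER PRIMARY KEY,
    controller_id INTEGER NOT NULL REFERENCES token(id),
    target_id     INTEGER NOT NULL REFERENCES token(id),
    domain        TEXT NOT NULL
);
"""


def build_demo() -> sqlite3.Connection:
    """Create an in-memory database holding the start of Luke 2:8 and three relations."""
    con = sqlite3.connect(":memory:")
    con.executescript(SCHEMA)
    words = "And there were in the same country shepherds".split()
    con.executemany(
        "INSERT INTO token (text_name, verse_id, position, form) VALUES (?, ?, ?, ?)",
        [("eng-kingjames-1611", "42002008", pos, w) for pos, w in enumerate(words)],
    )

    def token_id(form: str) -> int:
        return con.execute("SELECT id FROM token WHERE form = ?", (form,)).fetchone()[0]

    pairs = [
        ("country", "the", "attributive"),
        ("country", "same", "attributive"),
        ("shepherds", "were", "predicative"),
    ]
    con.executemany(
        "INSERT INTO relation (controller_id, target_id, domain) VALUES (?, ?, ?)",
        [(token_id(c), token_id(t), d) for c, t, d in pairs],
    )
    con.commit()
    return con


def targets_of(con: sqlite3.Connection, controller_form: str):
    """Return (target form, domain) pairs for a controller, in text order."""
    return con.execute(
        """
        SELECT t.form, r.domain
        FROM relation AS r
        JOIN token AS c ON c.id = r.controller_id
        JOIN token AS t ON t.id = r.target_id
        WHERE c.form = ?
        ORDER BY t.position
        """,
        (controller_form,),
    ).fetchall()


if __name__ == "__main__":
    print(targets_of(build_demo(), "country"))
```

Because the relations live in their own table, several annotators or annotation layers can refer to the same tokens without the text ever being modified.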
slide-19
SLIDE 19
slide-20
SLIDE 20

Preliminary results: Development and pervasiveness of agreement

slide-21
SLIDE 21

Preliminary results: Development and pervasiveness of agreement

Global pervasiveness

  • the overall frequency of agreement forms (absolute frequency)
  • the proportion of agreement forms with overt morphology among all potential agreement forms (with covert and overt morphology)
  • agreement defined as covariation between a controller and a target in terms of:

  • at least one feature (rough)
  • one, two and three features (fine-grained)
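
The two measures can be sketched as follows, assuming that every potential agreement form is reduced to the number of features (0 to 3) in which it overtly covaries with its controller. The function names and the toy input are invented for illustration and are not project data.

```python
from collections import Counter


def _check(feature_counts):
    if not feature_counts:
        raise ValueError("no potential agreement forms given")
    if any(n not in (0, 1, 2, 3) for n in feature_counts):
        raise ValueError("feature counts must be 0, 1, 2 or 3")


def pervasiveness(feature_counts) -> float:
    """Rough measure: share of potential forms that agree in at least one feature."""
    _check(feature_counts)
    overt = sum(1 for n in feature_counts if n >= 1)
    return overt / len(feature_counts)


def feature_distribution(feature_counts) -> dict:
    """Fine-grained measure: share of forms agreeing in 0, 1, 2 and 3 features."""
    _check(feature_counts)
    counts = Counter(feature_counts)
    return {n: counts.get(n, 0) / len(feature_counts) for n in range(4)}


if __name__ == "__main__":
    toy = [0, 1, 1, 2, 3, 0, 2, 1]  # invented values, one per potential form
    print(pervasiveness(toy))
    print(feature_distribution(toy))
```

The rough measure is simply one minus the share of forms with zero features in the fine-grained distribution.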
slide-22
SLIDE 22

Preliminary results: Development and pervasiveness of agreement

Number of potential agreement forms

[Figure: bar chart of the number of potential agreement forms per text. Axes: frequency (ticks at 50, 100, 150, 200) and text name, grouped by language (lat, got, deu, nds, nld, fry, frs, frr, eng, afr, non, isl, fao, dan, nob, nno, cym, bre); legend: total. Texts: lat−vulgate−400, got−wulfila−500, deu−luther−2017, deu−luther−1912, deu−luther−1545, deu−mentel−1466, deu−beheim−1343, deu−tatian−830, nds−jessen−1937, nds−bugenhagen−1533, nld−staten−1977, nld−staten−1637, fry−wumkes−1943, frs−sater−2000, frr−sylt−2004, eng−kingjames−1611, eng−wycliffe−1388, afr−bible−1953, non−homiliebok−1200, isl−biblian−2007, isl−guðbrand−1584, fao−biblian−1949, dan−autoriseret−1992, dan−reformation−1550, nob−bibelen−2011, nno−bibelen−2011, cym−llafar−2013, cym−bwm1620−2004, cym−bwm1620−1955, cym−bwm−1588, cym−ytn−1567, bre−koad21−2010, bre−oliero−1913, bre−lecoat−1893, bre−legonidec−1827.]

slide-23
SLIDE 23

Preliminary results: Development and pervasiveness of agreement

Global pervasiveness: absolute and relative

[Figure: stacked bar chart of forms with and without agreement per text, with percentage labels on the bars. Axes: frequency (ticks at 50, 100, 150, 200) and text name (35 texts from lat−vulgate−400 to bre−legonidec−1827, grouped by language); legend: agreement, no agreement.]

slide-24
SLIDE 24

Preliminary results: Development and pervasiveness of agreement

Global pervasiveness: number of features (%-scaled)

[Figure: percent-scaled stacked bar chart of the number of agreeing features per text, with counts on the bars. Axes: percent (ticks at 25, 50, 75, 100) and text name (35 texts from lat−vulgate−400 to bre−legonidec−1827, grouped by language); legend: three, two, one, none.]

slide-25
SLIDE 25

Preliminary results: Development and pervasiveness of agreement

Global pervasiveness: distribution of domains (%-scaled)

[Figure: percent-scaled stacked bar chart of agreement domains per text, with counts on the bars. Axes: percent (ticks at 25, 50, 75, 100) and text name (35 texts from lat−vulgate−400 to bre−legonidec−1827, grouped by language); legend: attributive, predicative, relative, anaphoric.]

slide-26
SLIDE 26

Exploring the results

slide-27
SLIDE 27

Exploring the results

All relations: Breton/Welsh & German/Low German

[Figure: four panels (cym, nds, bre, deu). x-axis: year (1000–2000); y-axis: total.perc (0.00–1.00); legend (relation): attributive, phoric, predicative, relative.]

  • distinct signatures
  • unbalanced sample
slide-28
SLIDE 28

Exploring the results

All relations: Brythonic & (Low) German

[Figure: two panels (Brythonic, West Germanic). x-axis: year (1000–2000); y-axis: total.perc (0.00–1.00); legend (relation): attributive, phoric, predicative, relative.]

  • two distinct groups
slide-29
SLIDE 29

Exploring the results

Number: Breton/Welsh & German/Low German

[Figure: four panels (cym, nds, bre, deu). x-axis: year (1000–2000); y-axis: number.perc (0.00–1.00); legend (relation): attributive, phoric, predicative, relative.]

  • language specific signature
slide-30
SLIDE 30

Exploring the results

Number: Brythonic & (Low) German

[Figure: two panels (Brythonic, West Germanic). x-axis: year (1000–2000); y-axis: number.perc (0.00–1.00); legend (relation): attributive, phoric, predicative, relative.]

  • phoric & predicative cluster
slide-31
SLIDE 31

Preliminary results: Mismatches

slide-32
SLIDE 32

Preliminary results: Mismatches

Mismatches

  • Mismatches = feature-value clashes between a controller and a target (Corbett 2006)
  • Common mismatch constellations:
  • lexical hybrids: syntactic vs. semantic agreement in collective nouns (e.g. committee, multitude) or German Mädchen ‘girl’, Weib ‘woman, wife’ (grammatically neuter, but often with feminine agreement)
  • constructional mismatches: agreement with coordinated subjects (Sg.+Sg. = Pl.?)
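
Read as a procedure, the definition above amounts to comparing the feature values of controller and target and reporting every feature on which they differ. The following sketch assumes feature values are stored as short labels in dictionaries; the function name `mismatches` and the labels are illustrative assumptions.

```python
def mismatches(controller: dict, target: dict) -> dict:
    """Return {feature: (controller value, target value)} for every clash.

    Features that only one side specifies are ignored, since a form that is
    indifferent for a feature cannot clash in it.
    """
    shared = sorted(controller.keys() & target.keys())
    return {
        feature: (controller[feature], target[feature])
        for feature in shared
        if controller[feature] != target[feature]
    }


if __name__ == "__main__":
    # Lexical hybrid: neuter 'Mädchen' with a feminine anaphoric pronoun 'sie'.
    maedchen = {"gender": "n", "number": "sg"}
    sie = {"gender": "f", "number": "sg", "person": "3"}
    print(mismatches(maedchen, sie))

    # Collective noun: singular 'committee' with plural verb 'have'.
    print(mismatches({"number": "sg"}, {"number": "pl"}))
```

An empty result corresponds to syntactic agreement; a gender or number entry in the result corresponds to the semantic agreement patterns discussed for hybrids.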

slide-33
SLIDE 33

Preliminary results: Mismatches

Gender hybrids

  • Relevant previous work: Fleischer (2012) for the development of agreement forms with the controller Weib ‘woman, wife’
  • Observation: syntactic agreement at the beginning and the end of the history of German in the personal pronoun, fluctuation in the relative pronoun

  • Due to genre effects?
slide-34
SLIDE 34

Preliminary results: Mismatches

Gender hybrids: Bible study

  • 6 German Bible versions
  • 102 verses
  • Problem: pejoration of Weib

(3) Und siehe, ein Weib, das zwölf Jahre blutflüssig war, trat von hinten herzu und rührte die Quaste seines Kleides an denn sie sprach bei sich selbst
‘And, behold, a woman, which was diseased with an issue of blood twelve years, came behind him, and touched the hem of his garment: For she said within herself, If I may but touch his garment, I shall be whole.’ (Mat. 9:20–21; Elberfelder Bibel 1905)

slide-35
SLIDE 35

Preliminary results: Mismatches

Gender hybrids (relative pronoun): Fleischer (2012)

[Figure: percent-scaled stacked bar chart (ticks at 25, 50, 75, 100) of feminine vs. neuter relative pronouns per source, with counts on the bars. Sources: Mannh. Korp. 1+2, Karl May, Goethe, Grimmelshausen, Luther, Prosalancelot I, Parzival, Kaiserchronik, Wiener Genesis, Notker, Otfrid, Tatian; legend: Relative (F.), Relative (N.).]

(Adapted from Fleischer 2012, p. 182)

slide-36
SLIDE 36

Preliminary results: Mismatches

Gender hybrids (relative pronoun): Bible study

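[Figure: percent-scaled stacked bar chart (ticks at 25, 50, 75, 100) of feminine vs. neuter relative pronouns per Bible version, with counts on the bars. Texts: deu−luther−2017−m, deu−schlacht−2000−m, deu−luther−1912−m, deu−elberfelder−1905−m, deu−luther−1545−m, deu−mentel−1466−m, deu−beheim−1343−m, deu−tatian−830−m; legend: feminine, neuter.]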

slide-37
SLIDE 37

Preliminary results: Mismatches

Gender hybrids (personal pronoun): Fleischer (2012)

[Figure: percent-scaled stacked bar chart (ticks at 25, 50, 75, 100) of feminine vs. neuter personal pronouns per source, with counts on the bars. Sources: Mannh. Korp. 1+2, Karl May, Goethe, Grimmelshausen, Luther, Prosalancelot I, Parzival, Kaiserchronik, Wiener Genesis, Notker, Otfrid, Tatian; legend: Personalpron. (F.), Personalpron. (N.).]

(Adapted from Fleischer 2012, p. 186)

slide-38
SLIDE 38

Preliminary results: Mismatches

Gender hybrids (personal pronoun): Bible study

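[Figure: percent-scaled stacked bar chart (ticks at 25, 50, 75, 100) of feminine vs. neuter personal pronouns per Bible version, with counts on the bars. Texts: deu−luther−2017−m, deu−schlacht−2000−m, deu−luther−1912−m, deu−elberfelder−1905−m, deu−luther−1545−m, deu−mentel−1466−m, deu−beheim−1343−m, deu−tatian−830−m; legend: feminine, neuter.]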

slide-39
SLIDE 39

Preliminary results: Mismatches

Gender hybrids (possessive pronoun): Fleischer (2012)

[Figure: percent-scaled stacked bar chart (ticks at 25, 50, 75, 100) of feminine vs. neuter possessive pronouns per source, with counts on the bars. Sources: Mannh. Korp. 1+2, Karl May, Goethe, Grimmelshausen, Luther, Prosalancelot I, Parzival, Kaiserchronik, Wiener Genesis, Notker, Otfrid, Tatian; legend: Possessive (F.), Possessive (N.).]

(Adapted from Fleischer 2012, p. 187)

slide-40
SLIDE 40

Preliminary results: Mismatches

Gender hybrids (possessive pronoun): Bible study

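[Figure: percent-scaled bar chart (ticks at 25, 50, 75, 100) of possessive pronouns per Bible version, with counts on the bars. Texts: deu−luther−2017−m, deu−schlacht−2000−m, deu−luther−1912−m, deu−elberfelder−1905−m, deu−luther−1545−m, deu−mentel−1466−m, deu−beheim−1343−m, deu−tatian−830−m; legend: feminine.]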

slide-41
SLIDE 41

Preliminary results: Mismatches

Number hybrids

  • Relevant previous work: Birkenes and Sommer (2015) for the development of agreement forms with collective nouns in the history of German and Greek
  • Observation: semantic agreement in the predicative domain mostly in translational texts, increase in syntactic agreement in pronouns in New High German

  • Genre effects at play?
slide-42
SLIDE 42

Preliminary results: Mismatches

Number hybrids: Bible study

  • 8 German Bible versions
  • 53 verses

(4) Dies Volk ehrt mich mit den Lippen, aber ihr Herz ist fern von mir; vergeblich dienen sie mir, weil sie lehren solche Lehren, die nichts als Menschengebote sind.
‘This people draweth nigh unto me with their mouth, and honoureth me with their lips; but their heart is far from me. But in vain they do worship me, teaching for doctrines the commandments of men.’ (Mat. 15:8–9, Luther 2017)

slide-43
SLIDE 43

Preliminary results: Mismatches

Number hybrids (predicative): Birkenes/Sommer (2015)

[Figure: percent-scaled stacked bar chart (ticks at 25, 50, 75, 100) of plural vs. singular verb agreement with collective nouns per source, with counts on the bars. Sources: IDS, Karl May, Goethe, Abraham a St., Luther, Bonner Corpus, Prosalancelot I−IV, Parzival, Kaiserchronik, Notker, Otfrid, Tatian; legend: Verb (Pl.), Verb (Sg.).]

(Adapted from Birkenes and Sommer 2015, pp. 202 and 209)

slide-44
SLIDE 44

Preliminary results: Mismatches

Number hybrids (predicative): Bible study

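[Figure: percent-scaled stacked bar chart (ticks at 25, 50, 75, 100) of verb agreement with number hybrids per Bible version, with counts on the bars. Texts: deu−luther−2017−m, deu−schlacht−2000−m, deu−luther−1912−m, deu−elberfelder−1905−m, deu−luther−1545−m, deu−mentel−1466−m, deu−beheim−1343−m, deu−tatian−830−m; legend: plural (pl), singular (sg), indifferent.]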

slide-45
SLIDE 45

Preliminary results: Mismatches

Number hybrids (pers. pronoun): Birkenes/Sommer (2015)

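[Figure: percent-scaled stacked bar chart (ticks at 25, 50, 75, 100) of plural vs. singular personal pronouns with collective nouns per source, with counts on the bars. Sources: IDS, Karl May, Goethe, Abraham a St., Luther, Bonner Corpus, Prosalancelot I−IV, Parzival, Kaiserchronik, Notker, Otfrid, Tatian; legend: Pron (Pl.), Pron (Sg.).]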

slide-46
SLIDE 46

Preliminary results: Mismatches

Number hybrids (personal pronoun): Bible study

[Figure: percent-scaled stacked bar chart (ticks at 25, 50, 75, 100) of personal pronoun agreement with number hybrids per Bible version, with counts on the bars. Texts: deu−luther−2017−m, deu−schlacht−2000−m, deu−luther−1912−m, deu−elberfelder−1905−m, deu−luther−1545−m, deu−mentel−1466−m, deu−beheim−1343−m, deu−tatian−830−m; legend: plural (pl), singular (sg), indifferent.]

slide-47
SLIDE 47

Preliminary results: Mismatches

Conclusion: Lexical hybrids

  • our data confirms data from previous studies in most cases
  • peak of semantic agreement in Middle High German
  • different results for personal pronouns with number hybrids
slide-48
SLIDE 48

Agreement-relevant Initial Consonant Mutations (ICMs) in Welsh

slide-49
SLIDE 49

ICM types

radical initial | soft mutation (SM, lenition) | aspirate mutation (AM) | nasal mutation (NM) | pre-vocalic aspiration (PVA)
p               | b                            | ph /f/                 | mh /m̥/              |
t               | d                            | th /θ/                 | nh /n̥/              |
c /k/           | g                            | ch /χ/                 | ngh /ŋ̊/             |
b               | f /v/                        |                        | m                   |
d               | dd /ð/                       |                        | n                   |
g               | zero                         |                        | ng /ŋ/              |
m               | f /v/                        |                        |                     |
rh /r̥/          | r                            |                        |                     |
ll /ɬ/          | l                            |                        |                     |
V               |                              |                        |                     | hV

unaffected initial segments: h, s, ch /χ/, si /ʃ/, ff /f/, f /v/, n, r, l

slide-50
SLIDE 50

Agreement-relevant ICMs in the attributive domain

relevant target types

  • definite article: y/’r/yr (morphologically gender and number indifferent)
  • certain numerals: un m/f ‘one’, ordinals
  • adjectives (mostly gender indifferent, partially number indifferent)

slide-51
SLIDE 51

Agreement-relevant ICMs in the attributive domain

masc. sg.          | fem. sg.           | plural
radical            | SM                 | radical
y + bachgen + bach | y + merch + bach   | y + tai + bach
y bachgen bach     | y ferch fach       | y tai bach
‘the little boy’   | ‘the little girl’  | ‘the little houses’

slide-52
SLIDE 52

Agreement-relevant ICMs in the phoric domain

relevant target types

  • 3rd person possessives (homophonic, partially homographic)
  • 3sg: ei/’i/’w or y
  • 3pl: eu/’u/’w or y
slide-53
SLIDE 53

Agreement-relevant ICMs in the phoric domain

3sg masc.       | 3sg fem.        | 3pl
SM              | AM, PVA         | PVA
ei + mam        | ei + mam        | eu + mam
ei fam          | ei mam          | eu mam
‘his mother’    | ‘her mother’    | ‘their mother’
ei + plant      | ei + plant      | eu + plant
ei blant        | ei phlant       | eu plant
‘his children’  | ‘her children’  | ‘their children’
ei + enw        | ei + enw        | eu + enw
ei enw          | ei henw         | eu henw
‘his name’      | ‘her name’      | ‘their name’
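
The possessive patterns above can be reproduced with a small rule table. The sketch below is a simplified illustration, not a full model of Welsh mutation: it assumes lower-case orthographic input, treats every word-initial a, e, i, o, u, w, y as a vowel, and ignores nasal mutation. The function names and the person labels "3sg.m", "3sg.f" and "3pl" are hypothetical.

```python
# Soft mutation (SM) and aspirate mutation (AM) of the radical initial.
# Digraphs come first so that 'll' and 'rh' are not read as single letters.
SOFT = [("ll", "l"), ("rh", "r"), ("p", "b"), ("t", "d"), ("c", "g"),
        ("b", "f"), ("d", "dd"), ("g", ""), ("m", "f")]
ASPIRATE = [("p", "ph"), ("t", "th"), ("c", "ch")]

# Initial digraphs that must not be mistaken for their first letter.
PROTECTED = ("ch", "th", "ph", "dd", "ff", "ng")
VOWELS = "aeiouwy"  # simplification: w and y are always treated as vowels


def _mutate(word: str, table) -> str:
    """Apply the first matching rule of the table to the word-initial segment."""
    if word.startswith(PROTECTED):
        return word
    for radical, mutated in table:
        if word.startswith(radical):
            return mutated + word[len(radical):]
    return word


def soft(word: str) -> str:
    return _mutate(word, SOFT)


def aspirate(word: str) -> str:
    return _mutate(word, ASPIRATE)


def pva(word: str) -> str:
    """Pre-vocalic aspiration: prefix h to a vowel-initial word."""
    return "h" + word if word[:1] in VOWELS else word


def possessive(person: str, noun: str) -> str:
    """Build a 3rd person possessive phrase with the mutation it triggers."""
    if person == "3sg.m":
        return "ei " + soft(noun)
    if person == "3sg.f":
        return "ei " + pva(aspirate(noun))
    if person == "3pl":
        return "eu " + pva(noun)
    raise ValueError(f"unsupported person: {person!r}")


if __name__ == "__main__":
    for noun in ("mam", "plant", "enw"):
        print([possessive(p, noun) for p in ("3sg.m", "3sg.f", "3pl")])
```

The output shows why the mutation matters for agreement: with `plant` the two singular possessives are kept apart only by the initial consonant of the noun, whereas with `mam` the feminine and plural forms leave the noun unchanged.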

slide-54
SLIDE 54

The data

five Welsh versions of the passage Luke 1:5–2:35

  • Y Testament Newydd, William Salesbury 1567
  • Beibl William Morgan 1588
  • Beibl William Morgan 1620, version of 1955
  • Beibl William Morgan 1620, version of 2004 (= Beibl Cymraeg Newydd Diwygiedig)

  • Beibl Cymraeg Llafar, 2013

4418 relations belonging to the attributive, predicative, relative and phoric domains

slide-55
SLIDE 55

The data

[Figure: bar chart “Agreement relation types and target feature expression”, with counts on the bars. Axes: agreement relation type (attributive, predicative, relative, phoric) and number of relations (ticks at 500, 1000, 1500); legend (target): indifferent, multirepresentation.]

slide-56
SLIDE 56

The data

[Figure: bar chart “Agreement relation types and target feature expression”, with counts on the bars. Axes: agreement relation type (attributive, predicative, relative, phoric) and number of relations (ticks at 500, 1000, 1500); legend (target): morphologically indifferent, morphologically indifferent + indifferent host anlaut, multirepresentation.]

slide-57
SLIDE 57

The data

morphologically indifferent target + indifferent host anlaut:

  • yr Arglwydd ‘the Lord’
  • yr Ysbryd Glan ‘the Holy Ghost’
  • yr angel ‘the angel’

slide-58
SLIDE 58

The data

[Figure: bar chart “Agreement-relevant ICMs in relations with multirepresentation”, with percentage labels on the bars. Axes: agreement relation type (attributive, predicative, phoric) and number of relations (ticks at 500, 1000, 1500); legend (host anlaut): susceptible to ICM, indifferent to ICM, not applicable.]

slide-59
SLIDE 59

The data

[Figure: bar chart “Agreement-relevant ICMs in relations with multirepresentation”, with percentage labels on the bars. Axes: agreement relation type (attributive, predicative, phoric) and number of relations (ticks at 500, 1000, 1500); legend (agreement-relevant ICM): mutation triggered, mutation absent, indifferent anlaut, not applicable.]

slide-60
SLIDE 60

Concluding remarks

slide-61
SLIDE 61

Concluding remarks

Conclusion

  • pervasiveness study: confirms the expected development of agreement in the Germanic and Brittonic languages, now quantifiable
  • mismatch study: genre seems to have an effect. The developments toward more syntactic agreement in New High German, as shown by Fleischer (2012) and Birkenes and Sommer (2015), are less clear here (especially in the “number hybrids”)
  • mutations: in 27.4% of the agreement relations in Welsh, ICMs play a part in the expression of agreement features

slide-62
SLIDE 62

Concluding remarks

For discussion

  • parallel texts are not that parallel!
  • genre and register in Bible translations, conservative style
  • effects of narrative genre: past tense overrepresented (more agreement morphology in the present tense), unbalanced (Germanic), finite vs. non-finite constructions, passive vs. impersonal (fewer targets)
  • interdependency of Bible translations
  • same context, but not necessarily the same controller:
  • number hybrids: a plural controller (multitude vs. multitudes)
  • gender hybrids: a feminine noun (Weib/Mädchen vs. Frau, Jungfrau)
  • translation syntax? mismatches in subject-verb agreement in Germanic

slide-63
SLIDE 63

Concluding remarks

Outlook

  • look at other texts than the Bible
  • Middle High German: Kaiserchronik, a parallel text in terms of different manuscripts
  • Middle Welsh: Owein
  • Breton: Dictionnaire et Colloques
slide-64
SLIDE 64

Concluding remarks

Literature I

References

Birkenes, Magnus Breder and Florian Sommer (2015). “The agreement of collective nouns in the history of Ancient Greek and German.” In: Language Change at the Syntax-Semantics Interface. Ed. by Chiara Gianollo, Agnes Jäger, and Doris Penka. Berlin: de Gruyter, pp. 183–221.

Corbett, Greville G. (2006). Agreement. Cambridge University Press.

Cysouw, Michael and Bernhard Wälchli, eds. (2007). Parallel texts. Berlin: Akademie Verlag.

Fleischer, Jürg (2012). “Grammatische und semantische Kongruenz in der Geschichte des Deutschen: eine diachrone Studie zu den Kongruenzformen von ahd. wīb, nhd. Weib.” In: Beiträge zur Geschichte der Deutschen Sprache und Literatur 134, pp. 163–203.

slide-65
SLIDE 65

Concluding remarks

Literature II

Haug, Dag Trygve Truslew and Marius Jøhndal (2008). “Creating a Parallel Treebank of the Old Indo-European Bible Translations.” In: Proceedings of the Sixth International Language Resources and Evaluation (LREC’08). European Language Resources Association (ELRA). Paris.

Köpcke, Klaus Michael and David A. Zubin (2009). “Genus.” In: Deutsche Morphologie. Ed. by Elke Hentschel and Petra M. Vogel. Berlin: de Gruyter, pp. 132–154.

Leser-Cronau, Stephanie (2017). “4.4 Neutrale Kongruenzformen für Personen.” In: SyHD-atlas. URL: http://www.syhd.info/apps/atlas/#neutrum-fuer-personen.

Levin, Magnus (2001). Agreement with collective nouns in English. Stockholm: Almqvist & Wiksell.

Mayer, Thomas and Michael Cysouw (2014). “Creating a massively parallel Bible corpus.” In: Proceedings of LREC 2014, pp. 3158–3163.