A Bayesian test of the lineage-specificity of word-order correlations
Gerhard Jäger
Tübingen University
Workshop The origins and evolution of word order
April 16, 2018
Gerhard Jäger (Tübingen) Word-order Universals Evolang 2018 1 / 23
Introduction
Greenberg, Keenan, Lehmann etc.: general tendency for languages to be either consistently head-initial or consistently head-final
alternative account (Dryer, Hawkins): phrases are consistently left- or consistently right-branching
can be formalized as a collection of implicative universals, such as
    “With overwhelmingly greater than chance frequency, languages with normal SOV order are postpositional.” (Greenberg’s Universal 4)
both generativist and functional/historical explanations in the literature
(Figure from Dunn et al., 2011)
The phylogenetic comparative method
(Figure panels: Markov process; phylogeny; branching process)
independent: the two features evolve according to independent Markov processes
dependent: rates of change in one feature depend on the state of the other feature
(Figure: state diagrams. Independent model: states VO, OV and PN, NP. Dependent model: joint states VO/PN, OV/NP, OV/PN, VO/NP.)
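The difference between the two models can be made concrete as rate matrices over the four joint states. The following is a minimal sketch, not the implementation used in the talk: the function names and all numeric rates are hypothetical, and it assumes the common convention that only one feature changes at a time, so the dependent model has 8 free rates and the independent model has 4.

```python
from itertools import product

# Joint states of two binary features, labelled as on the slide.
VERB_OBJECT = ("VO", "OV")
ADPOSITION = ("PN", "NP")
JOINT_STATES = [f"{v}/{a}" for v, a in product(VERB_OBJECT, ADPOSITION)]
# JOINT_STATES == ['VO/PN', 'VO/NP', 'OV/PN', 'OV/NP']


def single_change(src, dst):
    """True if exactly one of the two features differs between src and dst."""
    return sum(x != y for x, y in zip(src.split("/"), dst.split("/"))) == 1


def rate_matrix(rates):
    """Build a 4x4 rate matrix from a dict {(src, dst): rate}.

    Only the 8 single-feature changes receive a rate; simultaneous changes
    of both features stay at 0, and each diagonal entry makes its row sum to 0.
    """
    q = [[0.0] * len(JOINT_STATES) for _ in JOINT_STATES]
    for i, src in enumerate(JOINT_STATES):
        for j, dst in enumerate(JOINT_STATES):
            if single_change(src, dst):
                q[i][j] = rates[(src, dst)]
        q[i][i] = -sum(q[i])
    return q


def independent_rates(vo_to_ov, ov_to_vo, pn_to_np, np_to_pn):
    """Expand 4 per-feature rates into the 8 joint-state rates."""
    verb = {("VO", "OV"): vo_to_ov, ("OV", "VO"): ov_to_vo}
    adp = {("PN", "NP"): pn_to_np, ("NP", "PN"): np_to_pn}
    rates = {}
    for src in JOINT_STATES:
        for dst in JOINT_STATES:
            if single_change(src, dst):
                (v1, a1), (v2, a2) = src.split("/"), dst.split("/")
                rates[(src, dst)] = verb[(v1, v2)] if v1 != v2 else adp[(a1, a2)]
    return rates


# Independent model: 4 hypothetical rates, shared across contexts.
q_ind = rate_matrix(independent_rates(0.5, 0.2, 0.3, 0.1))

# Dependent model: any of the 8 rates may differ. Here (hypothetically)
# PN -> NP is much faster when the language is OV.
dep = independent_rates(0.5, 0.2, 0.3, 0.1)
dep[("OV/PN", "OV/NP")] = 5.0
q_dep = rate_matrix(dep)
```

In `q_ind` the rate of PN → NP is the same whether the language is VO or OV; in `q_dep` it is not, which is what "dependent" means here.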
Dunn et al. (2011)
all 28 pairs of 8 word-order features considered
4 language families: Austronesian, Bantu, Indo-European, and Uto-Aztecan
main finding: wildly different results between families
conclusion: word-order correlations are lineage-specific
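The count of 28 is the number of unordered pairs of 8 features. As a quick check, the pairs can be enumerated; the feature labels below are the ones used in this deck's tables, and the variable names are illustrative only.

```python
from itertools import combinations

# The eight word-order features, labelled as in this deck.
FEATURES = ["V-Subj", "N-Adj", "N-Dem", "N-Num", "N-Gen", "Adp-N", "V-Obj", "N-Rel"]

# Every unordered pair of distinct features.
pairs = list(combinations(FEATURES, 2))

print(len(pairs))  # 28
```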
Universal and lineage-specific models
(Diagram: on one side, separate models M1 to M4, each paired with its own trees and data: trees1/data1, trees2/data2, trees3/data3, trees4/data4; on the other, a single model M paired with all four sets of trees and data.)
ASJP word lists (Wichmann et al., 2016)
feature extraction (automatic cognate detection, inter alia) → character matrix
Maximum-Likelihood phylogenetic inference with Glottolog (Hammarström et al., 2016) tree as backbone
advantages over hand-coded Swadesh lists:
    applicable across language families
    covers more languages than those for which expert cognate judgments are available
1004 languages in total
    Austronesian: 123; Bantu: 41; Indo-European: 53; Uto-Aztecan: 13
34 families with at least five languages, comprising 768 languages in total
universal: one set of rates (8 parameters), applying to all 4 families
lineage-specific: a separate set of rates for each family
    advantage: good fit of the lineage-specific data
    disadvantage: many parameters (8 per family for a dependent model)
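The parameter trade-off can be spelled out with simple arithmetic. This is an illustrative sketch only; the function and constant names are hypothetical, and it counts rate parameters for the dependent model as stated on the slide (8 per rate set).

```python
FAMILIES = ["Austronesian", "Bantu", "Indo-European", "Uto-Aztecan"]
RATES_PER_DEPENDENT_MODEL = 8  # one rate per single-feature change


def n_rate_parameters(families, rates_per_set, lineage_specific):
    """Number of rate parameters: one set overall, or one set per family."""
    n_sets = len(families) if lineage_specific else 1
    return n_sets * rates_per_set


universal_params = n_rate_parameters(FAMILIES, RATES_PER_DEPENDENT_MODEL, False)
lineage_params = n_rate_parameters(FAMILIES, RATES_PER_DEPENDENT_MODEL, True)

print(universal_params, lineage_params)  # 8 32
```

With four families the lineage-specific dependent model has four times as many rate parameters as the universal one, which is the cost that model comparison has to weigh against the better fit.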
feature pair       Bayes Factor
Adp-N  V-Obj       58.1
Adp-N  N-Gen       47.2
N-Adj  N-Rel       41.6
N-Gen  V-Obj       36.9
Adp-N  V-Subj      23.6
N-Gen  N-Rel       21.9
N-Dem  N-Num       20.6
Adp-N  N-Rel       18.7
V-Obj  N-Rel       18.1
N-Dem  N-Rel       17.4
N-Rel  V-Subj      14.5
N-Gen  V-Subj      13.7
V-Obj  V-Subj      12.1
N-Adj  N-Dem        5.4
Adp-N  N-Dem
N-Dem  N-Gen
N-Adj  N-Num
N-Adj  V-Subj
N-Dem  V-Obj
N-Num  N-Rel
N-Adj  Adp-N
N-Dem  V-Subj
N-Adj  V-Obj
N-Adj  N-Gen
Adp-N  N-Num
N-Gen  N-Num
N-Num  V-Subj
N-Num  V-Obj
Legend: universal / lineage-specific
feature pair       Bayes Factor
Adp-N  N-Gen       115.7
Adp-N  V-Obj       104.8
N-Dem  N-Num        99.6
N-Adj  N-Num        93.3
N-Gen  V-Obj        68.0
N-Adj  N-Dem        64.9
N-Adj  N-Rel        48.5
N-Gen  V-Subj       41.1
V-Obj  V-Subj       38.2
V-Obj  N-Rel        35.3
N-Dem  N-Rel        33.5
Adp-N  V-Subj       31.3
N-Gen  N-Rel        23.8
N-Dem  N-Gen        23.5
Adp-N  N-Rel        22.6
N-Gen  N-Num        16.5
N-Dem  V-Obj        15.4
Adp-N  N-Dem        15.0
N-Num  V-Subj       14.4
Adp-N  N-Num        13.5
N-Adj  N-Gen        12.2
N-Num  V-Obj         7.6
N-Rel  V-Subj        6.8
N-Num  N-Rel         5.0
N-Adj  Adp-N         0.3
N-Adj  V-Subj
N-Adj  V-Obj
N-Dem  V-Subj
Legend: correlated / uncorrelated
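For orientation, a Bayes factor compares two models through their marginal likelihoods; on the log scale it is simply a difference. The sketch below uses made-up log marginal likelihoods and hypothetical names. It does not reproduce any value from the tables, and it makes no assumption about the scale on which the tabulated values are reported.

```python
import math


def log_bayes_factor(log_ml_a, log_ml_b):
    """Natural-log Bayes factor of model A over model B.

    Positive values favour model A, negative values favour model B.
    """
    return log_ml_a - log_ml_b


# Hypothetical log marginal likelihoods for two competing models.
log_ml_model_a = -250.0
log_ml_model_b = -262.5

log_bf = log_bayes_factor(log_ml_model_a, log_ml_model_b)
bf = math.exp(log_bf)  # the Bayes factor itself, as a likelihood ratio
```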
(Figure over the eight features: V-Subj, N-Adj, N-Dem, N-Num, N-Gen, Adp-N, V-Obj, N-Rel)
(Figure: four state-transition diagrams with numeric edge labels, one per family: Austronesian, Bantu, Indo-European, Uto-Aztecan. State labels: PN VO, PN OV, NP VO, NP OV; and NG NNum, NG NumN, GN NNum, GN NumN.)
(Figure: state-transition diagrams with numeric edge labels for further feature pairs, among them N-Gen/N-Rel, N-Gen/V-Obj, N-Gen/V-Subj, N-Rel/V-Subj, V-Obj/N-Rel, and V-Obj/V-Subj, plus partially legible panels involving Dem, Num, and Rel. Each panel has four joint states, e.g. N-Gen N-Rel, N-Gen Rel-N, Gen-N N-Rel, Gen-N Rel-N.)
Conclusion
universal vs. lineage-specific is not an absolute distinction, but a matter of degree
some “classical” word-order correlations fall very close to the universal end
important to fit statistical models across language families
References
Matthew S. Dryer. The Greenbergian word order correlations. Language, 68(1):81–138, 1992.
Michael Dunn, Simon J. Greenhill, Stephen Levinson, and Russell D. Gray. Evolved structure of language shows lineage-specific trends in word-order universals. Nature, 473(7345):79–82, 2011.
Harald Hammarström, Robert Forkel, Martin Haspelmath, and Sebastian Bank. Glottolog 2.7. Max Planck Institute for the Science of Human History, Jena, 2016. Available online at http://glottolog.org, accessed on 2017-01-29.
Martin Haspelmath, Matthew S. Dryer, David Gil, and Bernard Comrie. The World Atlas of Language Structures online. Max Planck Digital Library, Munich,
Sebastian Höhna, Michael J. Landis, Tracy A. Heath, Bastien Boussau, Nicolas Lartillot, Brian R. Moore, John P. Huelsenbeck, and Fredrik Ronquist. RevBayes: Bayesian phylogenetic inference using graphical models and an interactive model-specification language. Systematic Biology, 65(4):726–736, 2016.
Elena Maslova. A dynamic approach to the verification of distributional universals. Linguistic Typology, 4(3):307–333, 2000.
Mark Pagel and Andrew Meade. Bayesian analysis of correlated evolution of discrete characters by reversible-jump Markov chain Monte Carlo. The American Naturalist, 167(6):808–825, 2006.
Søren Wichmann, Eric W. Holman, and Cecil H. Brown. The ASJP database (version 17). http://asjp.clld.org/, 2016.