 
12/1/2014
Natural Language Processing
Diachronics
Dan Klein – UC Berkeley
Includes joint work with Alex Bouchard-Côté, Tom Griffiths, and David Hall
The Task
Lexical Reconstruction
Latin: focus
French: feu; Spanish: fuego; Italian: fuoco; Portuguese: fogo
Tree of Languages
• We assume the phylogeny is known
• Much work in biology, e.g. work by Warnow, Felsenstein, Steele…
• Also in linguistics, e.g. Warnow et al., Gray and Atkinson…
http://andromeda.rutgers.edu/~jlynch/language.html
Evolution through Sound Changes
Latin: camera /kamera/ (Eng. camera from Latin, “camera obscura”)
Deletion: /e/, /a/
Change: /k/ .. /tʃ/ .. /ʃ/
Insertion: /b/
French: chambre /ʃambʁ/ (Eng. chamber from Old Fr. before the initial /t/ dropped)
Changes are Systematic
camera /kamera/ → camra /kamra/
numerus /numerus/ → numrus /numrus/
Rule: e → _

Changes are Contextual
camera /kamera/ → camra /kamra/
Rule: e → _ / after stress

Changes Have Structure
camra /kamra/ → cambra /kambra/
Rule: _ → b / m_r
Generalized: _ → [stop x] / [nasal x]_r
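The two rules on these slides can be read as ordered, context-sensitive rewrites. The following is a minimal sketch of that reading, not the model from the talk: the `'` stress marker placed before the stressed vowel, the `RULES` table, and the `apply_rules` helper are all assumptions made for illustration, and contexts are approximated with regular-expression lookarounds.

```python
import re

# Ordered rewrite rules (name, compiled pattern, replacement).
# Assumption: stress is written as "'" directly before the stressed vowel.
RULES = [
    # e -> _ / after stress: delete an "e" that follows the stressed vowel
    # plus exactly one consonant.
    ("e > _ / after stress", re.compile(r"(?<='[aeiou][^aeiou'])e"), ""),
    # _ -> b / m_r: insert "b" at the empty position between "m" and "r".
    ("_ > b / m_r", re.compile(r"(?<=m)(?=r)"), "b"),
]

def apply_rules(form, rules=RULES):
    """Apply each rule, in order, at every matching position of the form.

    Returns the list of intermediate forms, starting with the input.
    """
    history = [form]
    for _name, pattern, replacement in rules:
        form = pattern.sub(replacement, form)
        history.append(form)
    return history

for word in ["k'amera", "n'umerus"]:
    print(" > ".join(apply_rules(word)))
```

Because each rule is applied to every word at every matching position, the same table turns both /kamera/ and /numerus/ into forms with the /mbr/ cluster, which is the sense in which the changes are systematic.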
Changes are Systematic
English Great Vowel Shift (Simplified!)
“time” = teem; “time” = taim
[Figure: vowel chart with i, e, a]
Diachronic Evidence
Yahoo! Answers [ca. 2000]: tonight not tonite
Appendix Probi [ca. 300]: tonitru non tonotru
Synchronic (Comparative) Evidence
Key idea: changes occur uniformly across the lexicon
The Data
The Data
• Data sets
  • Small: Romance
    • French, Italian, Portuguese, Spanish (FR, IT, PT, ES)
    • 2344 words
    • Complete cognate sets
    • Target: (Vulgar) Latin
  • Large: Austronesian
    • 637 languages
    • 140K words
    • Incomplete cognate sets
    • Target: Proto-Austronesian
Austronesian
Austronesian Examples
From the Austronesian Basic Vocabulary Database
The Model
Simple Model: Single Characters
[Figure: tree whose nodes each carry a single character, G or C]
[cf. Felsenstein 81]
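For a single character per node, the posterior over the ancestral character can be computed with an upward pass over the tree, in the spirit of the pruning recursion cited as [cf. Felsenstein 81]. The sketch below rests on assumptions that are not on the slide: a two-letter alphabet, a symmetric mutation model with a made-up `stay` probability, a uniform prior at the root, and a small hypothetical tree.

```python
ALPHABET = ["C", "G"]

def transition(parent, child, stay=0.9):
    """Hypothetical symmetric mutation model: keep the character with prob `stay`."""
    if parent == child:
        return stay
    return (1.0 - stay) / (len(ALPHABET) - 1)

def upward(node):
    """Return {state: P(observed leaves below node | node has that state)}."""
    if isinstance(node, str):  # leaf: the observed character
        return {s: 1.0 if s == node else 0.0 for s in ALPHABET}
    messages = [upward(child) for child in node]
    result = {}
    for s in ALPHABET:
        prob = 1.0
        for msg in messages:
            # Sum out the child's state along the branch.
            prob *= sum(transition(s, t) * msg[t] for t in ALPHABET)
        result[s] = prob
    return result

def root_posterior(tree):
    """Posterior over the root character under a uniform prior."""
    up = upward(tree)
    prior = 1.0 / len(ALPHABET)
    z = sum(prior * up[s] for s in ALPHABET)
    return {s: prior * up[s] / z for s in ALPHABET}

# Hypothetical tree: internal nodes are tuples, leaves are observed characters.
tree = (("G", "G"), ("C", "G"))
print(root_posterior(tree))
```

With string-valued nodes this exact recursion is no longer tractable, since the sum over child states would range over all strings.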
Changes are Systematic
[Figure: trees of word forms]
focus: root /fokus/, internal node /fogo/, leaves /fwɔko/, /fweɣo/, /fogo/
centrum: root /kentrum/, internal node /sentro/, leaves /tʃɛntro/, /sentro/, /sentro/
Parameters are Branch-Specific
One parameter set per branch (→ IT, → IB, → ES, → PT)
LA: focus /fokus/
IB: /fogo/
IT: fuoco /fwɔko/; ES: fuego /fweɣo/; PT: fogo /fogo/
[Bouchard-Côté, Griffiths, Klein, 07]
Edits are Contextual, Structured
[Figure: on the → IT branch, /fokus/ becomes /fwɔko/; the edit of o, yielding w ɔ, is shown with its context # f]
Inference
Learning: Objective
[Figure: tree with /fokus/, /fogo/, /fwɔko/, /fweɣo/, /fogo/; variables labeled z and w]
Learning: EM
• M-Step
  • Find parameters which fit (expected) sound change counts
  • Easy: gradient ascent on theta
• E-Step
  • Find (expected) change counts given parameters
  • Hard: variables are string-valued
[Figure: tree with /fokus/, /fogo/, /fwɔko/, /fweɣo/, /fogo/]
Computing Expectations
Standard approach, e.g. [Holmes 2001]: Gibbs sampling each sequence
[Figure: ‘grass’]
[Holmes 01, Bouchard-Côté, Griffiths, Klein 07]
A Gibbs Sampler
[Figure: ‘grass’]
Getting Stuck
How could we jump to a state where the liquids /r/ and /l/ have a common ancestor?
Efficient Sampling: Vertical Slices
[Figure panels: Single Sequence Resampling; Ancestry Resampling]
[Bouchard-Côté, Griffiths, Klein, 08]
Results
Results: Romance
Learned Rules / Mutations
Results: Austronesian
Examples: Austronesian
[Bouchard-Côté, Hall, Griffiths, Klein, 13]
Result: More Languages Help
[Plot: Distance from Blust [1993] Reconstructions; axes: mean edit distance, number of modern languages used]
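The evaluation metric named on this slide, mean edit distance to reference reconstructions, can be sketched with the standard Levenshtein recurrence. Unit costs for insertion, deletion, and substitution are an assumption here, as are the function names and the example pairs; the slide does not say how the distance was parameterized.

```python
def edit_distance(a, b):
    """Levenshtein distance with unit costs, using a rolling row."""
    prev = list(range(len(b) + 1))
    for i, x in enumerate(a, start=1):
        curr = [i]
        for j, y in enumerate(b, start=1):
            curr.append(min(
                prev[j] + 1,             # delete x
                curr[j - 1] + 1,         # insert y
                prev[j - 1] + (x != y),  # substitute, or match at no cost
            ))
        prev = curr
    return prev[-1]

def mean_edit_distance(pairs):
    """Average distance over (predicted, reference) reconstruction pairs."""
    return sum(edit_distance(p, g) for p, g in pairs) / len(pairs)

# Hypothetical (predicted, reference) pairs, for illustration only.
pairs = [("fokus", "fokus"), ("fogo", "fokus")]
print(mean_edit_distance(pairs))
```

Strings are compared symbol by symbol, so forms should be segmented into phonemes first if a phoneme is written with more than one character.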
Visualization: Learned Universals
*The model did not have features encoding natural classes
Regularity and Functional Load
In a language, some pairs of sounds are more contrastive than others (higher functional load)
Example: English p/d versus t/th
High Load: p/d: pot/dot, pin/din, dress/press, pew/dew, ...
Low Load: t/th: thin/tin
Functional Load: Timeline
1955: Functional Load Hypothesis (FLH): Sound changes are less frequent when they merge phonemes with high functional load [Martinet, 55]
1967: Previous research within linguistics: “FLH does not seem to be supported by the data” [King, 67] (Based on 4 languages, as noted by [Hockett, 67; Surendran et al., 06])
Our approach: we reexamined the question with two orders of magnitude more data [Bouchard-Côté, Hall, Griffiths, Klein, 13]
Regularity and Functional Load
Data: only 4 languages from the Austronesian data
[Plot: merger posterior probability against functional load as computed by [King, 67]; each dot is a sound change identified by the system]
Regularity and Functional Load
Data: all 637 languages from the Austronesian data
[Plot: merger posterior probability against functional load as computed by [King, 67]]
Extensions
Cognate Detection
[Figure: the gloss ‘fire’ and an unsorted pool of word forms: /fwɔko/, /vɛrbo/, /tʃɛntro/, /sentro/, /berβo/, /fweɣo/, /vɛrbo/, /fogo/, /sɛntro/]
[Hall and Klein, 11]
Grammar Induction
Avg rel gain: 29%
[Bar chart over languages: Portuguese, Swedish, Chinese, Spanish, Slovene, English, Danish, Dutch; tree labels: GL, IE, G, RM, WG, NG]
[Berg-Kirkpatrick and Klein, 07]
Language Diversity
Why are the languages of the world so similar?
Universal grammar answer: Hardware constraints
Common source answer: Not much time has passed
[Rafferty, Griffiths, and Klein, 09]