Simulating Language Behavior: An Introduction
Çağrı Çöltekin
c.coltekin@rug.nl
Information science/Informatiekunde
2012-02-15
Tentative Plan
Week Subject
1 Introduction & Organization
2 Computational simulation of
Ç. Çöltekin
Simulating Language Behavior 1/34
Language Behavior
◮ How does a particular aspect of language comprehension work?
◮ Why are some sentences harder to comprehend than others?
◮ How do we acquire language(s)?
◮ Is there a difference between learning regular or irregular
◮ How do languages change in time?
◮ What are the causes of language change, and in which ways do
Modeling and Simulation
◮ Galilean model of the solar system.
◮ Bohr model of the atom.
◮ Atmospheric models used in meteorology.
◮ Scale models of cars, bridges, buildings etc. used in
◮ Animal models used in medicine.
Modeling and Simulation
◮ Why do we model things at all?
◮ If the model matches reality well, we can make predictions.
◮ We learn the phenomenon better while (formally) specifying
◮ Sometimes we cannot study the object of interest directly.
◮ Once we have the model, how do we get knowledge out of it?
◮ Study the model analytically.
◮ Run simulations.
Language Acquisition
◮ Human languages are complex (recursion, ambiguity).
◮ Children do not receive explicit instruction during language
◮ Language acquisition by children is (arguably) fast and robust.
◮ The input to children is not enough for learning (Poverty of
◮ Children do not receive input critical for learning certain
◮ Human languages are not learnable from positive input (Gold,
Language Acquisition
◮ Part of our linguistic abilities comes from our experience:
◮ Part of our linguistic abilities is innate: rocks and rabbits
Language Acquisition
◮ It is difficult to know the quantity/type of innate knowledge
◮ Empirical evidence is scarce, and interpreted differently.
◮ ‘Logical arguments’ are either clearly false, or misunderstood
Language Acquisition
◮ After Gold (1967), there have been many different results in
◮ Modeling is useful, but while interpreting results of models we
◮ Is the formal grammar a good candidate for the natural
◮ Is the learning method a plausible one?
◮ Does the characterization of the input match the real-world
Language Acquisition
◮ 7, 11, 13, 17
◮ 5, 7, 11, 13
◮ 13, 17, 19, 23
◮ ordered sequence of prime
◮ prime numbers
◮ odd prime numbers
◮ just the list of randomly
◮ . . .
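The point of the number example can be checked mechanically: several different hypotheses accept every observed sample, so positive examples alone do not pick one out. The sketch below is only an illustration. The predicate names are hypothetical, and reading "ordered sequence of prime" as "increasing primes with none skipped" is an assumption, since the bullet is cut off.

```python
def is_prime(n: int) -> bool:
    """Trial-division primality test, adequate for small integers."""
    if n < 2:
        return False
    return all(n % d for d in range(2, int(n ** 0.5) + 1))


def consecutive_primes(seq) -> bool:
    """Assumed reading of the first hypothesis: increasing primes, none skipped."""
    if not all(is_prime(n) for n in seq):
        return False
    expected = [n for n in range(seq[0], seq[-1] + 1) if is_prime(n)]
    return list(seq) == expected


# Candidate hypotheses (names are illustrative, not from the slides).
HYPOTHESES = {
    "consecutive primes in order": consecutive_primes,
    "prime numbers": lambda seq: all(is_prime(n) for n in seq),
    "odd prime numbers": lambda seq: all(is_prime(n) and n % 2 == 1 for n in seq),
}

# The three samples shown on the slide.
SAMPLES = [(7, 11, 13, 17), (5, 7, 11, 13), (13, 17, 19, 23)]

# Keep every hypothesis that accepts all of the positive examples.
consistent = [name for name, accepts in HYPOTHESES.items()
              if all(accepts(s) for s in SAMPLES)]
print(consistent)
```

All three hypotheses survive the three samples. Only a negative example, or a sample such as (2, 3), would separate them.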
Language Acquisition
◮ The nature–nurture debate is intriguing, yet an unresolved
◮ It has a central role in linguistics. I believe this role is not well
◮ Linguistics is just any other domain that may contribute to the
◮ The contribution of the debate to the study of language is
◮ More importantly, taking a priori sides in this unresolved
An example simulation: segmentation
◮ No clear acoustic markers in fluent speech.
◮ Large speaker variation in acoustic input.
◮ Noise in the environment.
◮ Children have to start with no knowledge of words.
◮ Even with a comprehensive knowledge of words, segmentation
An example simulation: segmentation
∗Example reproduced from: (Shillcock, 1995)
An example simulation: segmentation
ljuuzuibutsjhiuljuuz ljuuztbzjubhbjompwfljuuz xibutuibu ljuuz epzpvxbounpsfnjmlipofz ljuuzljuuzephhjf
xibuepftbljuuztbz ephhjfeph ephhjf
xibuepftuifephhjftbz mjuumfcbczcjsejf cbczcjsejf zpvepoumjlfuibupof plbznpnnzublfuijtpvu dpx uifdpxtbztnppnpp xibuepftuifdpxtbzopnj
◮ No clear boundary markers
◮ No lexical knowledge
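The unfamiliar strings above appear to be English child-directed utterances in which every letter has been replaced by the next letter of the alphabet, so that the reader faces them without lexical knowledge. A minimal sketch that undoes the shift is given below. The function name `shift_back` is hypothetical, and the one-letter shift is inferred from the strings themselves rather than stated on the slide.

```python
def shift_back(text: str, shift: int = 1) -> str:
    """Move every lowercase letter `shift` places back in the alphabet, wrapping a to z."""
    out = []
    for ch in text:
        if "a" <= ch <= "z":
            out.append(chr((ord(ch) - ord("a") - shift) % 26 + ord("a")))
        else:
            out.append(ch)  # spaces between utterances are left untouched
    return "".join(out)


# Three of the utterances from the slide.
encoded = ["ljuuzuibutsjhiuljuuz", "xibutuibu", "xibuepftuifephhjftbz"]
for utterance in encoded:
    print(utterance, "->", shift_back(utterance))
```

The decoded strings are still unsegmented ("kittythatsrightkitty"), which is exactly the input a segmentation model has to work with.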
An example simulation: segmentation
◮ If l helps us predict r, lr is likely to be part of a word.
◮ If observing r after l is surprising, it is likely that there is a
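One common way to quantify how well l predicts r is the transitional probability P(r | l), estimated as count(lr) / count(l). The sketch below is an illustration under assumptions: single letters stand in for phonemes, the toy corpus is invented, and the function name is hypothetical.

```python
from collections import Counter


def transitional_probabilities(utterances):
    """Estimate P(r | l) = count(lr) / count(l as a left context) from unsegmented strings."""
    pair_counts = Counter()
    left_counts = Counter()
    for utt in utterances:
        for l, r in zip(utt, utt[1:]):
            pair_counts[(l, r)] += 1
            left_counts[l] += 1
    return {(l, r): c / left_counts[l] for (l, r), c in pair_counts.items()}


# Invented toy corpus, written without word boundaries.
corpus = ["thedog", "thecat", "adog", "acat"]
tp = transitional_probabilities(corpus)
print(tp[("t", "h")])  # inside a word: 'h' always follows 't' here
print(tp[("e", "d")])  # across a word boundary: 'd' follows 'e' only half the time
```

In this toy corpus the word-internal pair gets probability 1.0 and the pair that straddles a boundary gets 0.5, which is the contrast the two bullets describe.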
An example simulation: segmentation
◮ Acoustic cues, such as
◮ lexical knowledge
◮ phonotactics
◮ utterance boundaries
◮ distributional regularities
◮ predictability
An example simulation: segmentation
◮ Transitional probability
◮ Pointwise mutual information
◮ Successor value
◮ Boundary entropy
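The definitions behind these measures did not survive on this slide, so the sketch below uses their common textbook forms as an assumption: pointwise mutual information log2 P(lr) / (P(l) P(r)), successor value as the number of distinct symbols seen after l, and boundary entropy as the entropy of the next-symbol distribution after l. Letters stand in for phonemes, probabilities are estimated from adjacent-pair counts, and all names are hypothetical.

```python
import math
from collections import Counter, defaultdict


def boundary_measures(utterances):
    """Return PMI per pair, plus successor value and boundary entropy per left symbol."""
    pairs, left, right = Counter(), Counter(), Counter()
    followers = defaultdict(Counter)
    for utt in utterances:
        for l, r in zip(utt, utt[1:]):
            pairs[(l, r)] += 1
            left[l] += 1
            right[r] += 1
            followers[l][r] += 1
    total = sum(pairs.values())
    # PMI: how much more often l and r co-occur than expected if independent.
    pmi = {
        (l, r): math.log2((c / total) / ((left[l] / total) * (right[r] / total)))
        for (l, r), c in pairs.items()
    }
    # Successor value: number of distinct symbols observed right after l.
    successor_value = {l: len(nxt) for l, nxt in followers.items()}
    # Boundary entropy: uncertainty (in bits) about the symbol that follows l.
    entropy = {
        l: -sum((c / left[l]) * math.log2(c / left[l]) for c in nxt.values())
        for l, nxt in followers.items()
    }
    return pmi, successor_value, entropy


corpus = ["thedog", "thecat", "adog", "acat"]  # invented toy corpus
pmi, sv, h = boundary_measures(corpus)
print(pmi[("t", "h")], sv["a"], h["a"])
```

High PMI suggests the pair belongs together, while a high successor value or high entropy after a symbol suggests a likely boundary.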
An example simulation: segmentation
◮ An obvious way to segment the sequence is using a threshold
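A threshold-based segmenter can be sketched as follows, assuming transitional probability as the measure and a hand-picked cutoff. Neither the threshold value nor the function names come from the slides, and pairs never seen in training are treated as probability 0.

```python
from collections import Counter


def train_tp(utterances):
    """Estimate P(r | l) from adjacent-symbol counts in unsegmented strings."""
    pair_counts, left_counts = Counter(), Counter()
    for utt in utterances:
        for l, r in zip(utt, utt[1:]):
            pair_counts[(l, r)] += 1
            left_counts[l] += 1
    return {pair: c / left_counts[pair[0]] for pair, c in pair_counts.items()}


def segment(utterance, tp, threshold):
    """Insert a boundary between l and r whenever P(r | l) falls below `threshold`."""
    if not utterance:
        return []
    words, current = [], utterance[0]
    for l, r in zip(utterance, utterance[1:]):
        if tp.get((l, r), 0.0) < threshold:  # unseen pairs count as probability 0
            words.append(current)
            current = r
        else:
            current += r
    words.append(current)
    return words


corpus = ["thedog", "thecat", "adog", "acat"]  # invented toy corpus
tp = train_tp(corpus)
print(segment("thedog", tp, threshold=0.75))
print(segment("acat", tp, threshold=0.75))
```

With this toy corpus "thedog" is split into "the" and "dog", but "acat" comes out as "a", "ca", "t": a single global threshold is easy to state and just as easy to get wrong.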
An example simulation: segmentation
◮ boundaries (BP, BR, BF),
◮ word tokens (WP, WR, WF),
◮ word types or the lexicon (LP, LR, LF).
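Reading the abbreviations as precision, recall and F-score at each of the three levels (an assumption, though a standard one in segmentation work), the scores can be computed as in the sketch below. The helper names and the example data are hypothetical.

```python
def prf(correct, predicted, gold):
    """Precision, recall and F-score from counts, with 0.0 where undefined."""
    p = correct / predicted if predicted else 0.0
    r = correct / gold if gold else 0.0
    f = 2 * p * r / (p + r) if p + r else 0.0
    return p, r, f


def boundaries(words):
    """Utterance-internal boundary positions, as character offsets."""
    cuts, pos = set(), 0
    for w in words[:-1]:
        pos += len(w)
        cuts.add(pos)
    return cuts


def spans(words):
    """Word tokens as (start, end) character spans; both edges must match to count."""
    out, pos = set(), 0
    for w in words:
        out.add((pos, pos + len(w)))
        pos += len(w)
    return out


def evaluate(predicted, gold):
    """Score segmented utterances at the boundary, word-token and lexicon level."""
    b_ok = b_pred = b_gold = 0
    w_ok = w_pred = w_gold = 0
    for p_words, g_words in zip(predicted, gold):
        pb, gb = boundaries(p_words), boundaries(g_words)
        b_ok, b_pred, b_gold = b_ok + len(pb & gb), b_pred + len(pb), b_gold + len(gb)
        ps, gs = spans(p_words), spans(g_words)
        w_ok, w_pred, w_gold = w_ok + len(ps & gs), w_pred + len(ps), w_gold + len(gs)
    p_lex = {w for utt in predicted for w in utt}  # word types proposed
    g_lex = {w for utt in gold for w in utt}       # word types in the gold standard
    return {
        "boundary": prf(b_ok, b_pred, b_gold),
        "token": prf(w_ok, w_pred, w_gold),
        "lexicon": prf(len(p_lex & g_lex), len(p_lex), len(g_lex)),
    }


gold = [["the", "dog"], ["a", "cat"]]
predicted = [["the", "dog"], ["a", "ca", "t"]]
scores = evaluate(predicted, gold)
for level, (p, r, f) in scores.items():
    print(f"{level}: P={p:.2f} R={r:.2f} F={f:.2f}")
```

In this example one spurious boundary leaves boundary recall at 1.0 but lowers boundary precision to 2/3, and it costs more at the token and lexicon levels, where a word counts only if both of its edges are right.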
An example simulation: segmentation
◮ is in line with the psycholinguistic research,
◮ is completely unsupervised,
◮ is incremental,
◮ performs competitively with an alternative state-of-the-art
◮ use information from utterance boundaries,
◮ keep an explicit lexicon and use it for further segmentation,
◮ make use of acoustic cues,
◮ use a better algorithm for boundary guessing.
Summary & Discussion
◮ Does the simulation study help us understand and predict
◮ Would it contribute to the nature–nurture debate?
◮ If we know one of the positions in the debate is correct, would
◮ Clearly we assume some initial knowledge, e.g., phonemes.
◮ If I knew for certain that phonemes are innate, it could have
◮ If I knew for certain that phonemes weren’t innate, it may
◮ But does it matter if this knowledge is language specific or