SLIDE 1
Taylor’s law for Human Linguistic Sequences

Tatsuru Kobayashi, Kumiko Tanaka-Ishii

Research Center for Advanced Science and Technology, The University of Tokyo

SLIDE 2
Power laws of natural language

  • 1. Vocabulary Population
      • Zipf’s law
      • Heaps’ law
  • 2. Burstiness ⇐ about how the words are aligned
      • Words occur in clusters; occurrences of words fluctuate

[Figure: word occurrences in ‘Moby Dick’]

These can be analyzed through power laws. Today’s talk is about quantifying the degree of fluctuation; how this could be useful will be presented at the end.


SLIDE 3

Fluctuation underlying text

Any words (any single word, any set of words) occur in clusters.

[Figure: occurrences of rare words in ‘Moby Dick’ (frequency rank below 3,162nd), e.g. the 2,000th and 2,500th ranked words]

Two ways of analysis:

  • Fluctuation analysis
  • Long-range correlation → weaknesses


SLIDE 4

Fluctuation underlying text → look at the variance within Δ𝑢

Any words (any single word, any set of words) occur in clusters.

[Figure: occurrences of rare words in ‘Moby Dick’ (frequency rank below 3,162nd), divided into segments of length Δ𝑢]

Two ways of analysis:

  • Fluctuation analysis
      • Fluctuation analysis (Ebeling, 1994): variance w.r.t. Δ𝑢
      • Taylor’s analysis: variance w.r.t. mean ← our achievement
  • Long-range correlation

Variance is larger when events are clustered than when they are random.
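A tiny simulation makes the claim concrete (our illustration, not from the paper; the burst construction and all parameters are invented for the sketch):

```python
# Per-segment counts of a clustered event have a larger variance than those
# of an i.i.d. event with the same overall rate. Parameters are illustrative.
import numpy as np

rng = np.random.default_rng(0)
n, du, p = 1_000_000, 5_000, 0.001      # sequence length, segment length Δu, event rate

random_seq = rng.random(n) < p          # i.i.d. occurrences
blocks = rng.random(n // 100) < p       # same overall rate, but each event
clustered_seq = np.repeat(blocks, 100)  # arrives as a burst of 100

for name, seq in [("random", random_seq), ("clustered", clustered_seq)]:
    counts = seq.reshape(-1, du).sum(axis=1)   # occurrences per segment of length Δu
    print(f"{name:9} mean={counts.mean():.1f} std={counts.std():.1f}")
# Both means come out near 5, but the clustered std is roughly 10x larger.
```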


SLIDE 5

Taylor’s law (Smith, 1938; Taylor, 1961)

A power law between the standard deviation and the mean of event occurrences within a given (space or) time window Δ𝑢:

\[
  \tau \propto \nu^{\beta}
\]

Empirically 0.5 ≤ 𝛽 ≤ 1.0 (though 𝛽 < 0.5 is of course possible, too). The law is empirically known to hold in vast fields (Eisler, 2007): ecology, life science, physics, finance, human dynamics, … The only previous application to language is Gerlach & Altmann (2014), which is not really a Taylor analysis. We devised a new method based on the original concept of Taylor’s law.
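As a concrete reading of the exponent (an illustrative consequence of the law, not stated on the slide): when the mean doubles, the standard deviation grows by a factor of 2^𝛽, so 𝛽 = 0.5 and 𝛽 = 1.0 behave very differently.

```latex
\[
  \tau \propto \nu^{\beta}
  \;\Longrightarrow\;
  \frac{\tau(2\nu)}{\tau(\nu)} = 2^{\beta},
  \qquad
  2^{0.5} \approx 1.41
  \quad\text{vs.}\quad
  2^{1.0} = 2 .
\]
```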


SLIDE 6

Our method

[Figure: a word sequence (text) 𝑥₁ 𝑥₂ … divided into consecutive segments of length Δ𝑢]

  • 1. For every word kind 𝑥ᵢ ∈ 𝑋, count its number of occurrences within each segment of a given length Δ𝑢.
  • 2. Obtain the mean 𝜈ᵢ and standard deviation 𝜏ᵢ of these counts for 𝑥ᵢ.
  • 3. Plot 𝜈ᵢ and 𝜏ᵢ for all words.
  • 4. Estimate 𝛽 by the least-squares method in log scale:

\[
  (\hat{d}, \hat{\beta}) = \operatorname*{arg\,min}_{d,\beta} \, \varepsilon(d, \beta),
  \qquad
  \varepsilon(d, \beta) = \frac{1}{|X|} \sum_{i=1}^{|X|} \bigl( \log \tau_i - \log d \nu_i^{\beta} \bigr)^{2}
\]
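These four steps translate directly into a short script. Below is a minimal sketch (assuming a pre-tokenized word list; the function name, the zero-variance filter, and the I/O lines are ours, not the paper’s code):

```python
# Estimate the Taylor exponent: count each word kind's occurrences in
# consecutive segments of length delta_u, then fit
# log tau = beta * log nu + log d by least squares in log scale.
import numpy as np
from collections import Counter

def taylor_exponent(tokens, delta_u=5000):
    n_seg = len(tokens) // delta_u
    segments = [Counter(tokens[k * delta_u:(k + 1) * delta_u]) for k in range(n_seg)]
    vocab = set(tokens[:n_seg * delta_u])
    nu, tau = [], []
    for w in vocab:
        c = np.array([seg[w] for seg in segments], dtype=float)  # Counter returns 0 if absent
        if c.std() > 0:              # zero-variance words cannot enter the log-log fit
            nu.append(c.mean())
            tau.append(c.std())
    beta, log_d = np.polyfit(np.log(nu), np.log(tau), 1)  # slope is the Taylor exponent
    return beta

# tokens = open("moby_dick.txt").read().split()   # hypothetical input file
# print(taylor_exponent(tokens))                  # the slides report ~0.57 for 'Moby Dick'
```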


SLIDE 7
Taylor’s law of natural language

‘Moby Dick’ (English, 250k words, vocabulary of 20k words): Taylor’s law in log scale.

  • Here, Δ𝑢 ≈ 5000.
  • Every point is a word kind.
  • The estimated Taylor exponent is 𝛽 = 0.57.
  • The Taylor exponent 𝛽 corresponds to the gradient of the log 𝜈–log 𝜏 plot.

[Figure: log 𝜈–log 𝜏 scatter plot, with words toward the upper right being more frequent and more fluctuated]


SLIDE 8

Taylor’s law of natural language

‘Moby Dick’ (English): Taylor’s law in log scale.

[Figure: the same log 𝜈–log 𝜏 plot, annotated with ‘Keywords’ toward the fluctuated side and ‘Functional words’ toward the less fluctuated side]


SLIDE 9

Theoretical analysis of the exponent

Empirically 0.5 ≤ 𝛽 ≤ 1.0. 𝛽 = 0.5 holds if all words are independent and identically distributed (i.i.d.). A shuffled text is equivalent to an i.i.d. process, so shuffled ‘Moby Dick’ has Taylor exponent 𝛽 = 0.5 (Δ𝑢 ≈ 5000).
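One way to see the 𝛽 = 0.5 baseline (a standard argument spelled out here; it is not on the slide): under i.i.d. draws, the count of a word with occurrence probability pᵢ within a segment of length Δ𝑢 is binomial, so its variance is close to its mean for rare words.

```latex
% k_i ~ Binomial(\Delta u, p_i) under i.i.d. word draws
\[
  \nu_i = \Delta u \, p_i, \qquad
  \tau_i^2 = \Delta u \, p_i (1 - p_i) \approx \nu_i
  \quad (p_i \ll 1)
  \;\Longrightarrow\;
  \tau_i \propto \nu_i^{1/2} .
\]
```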


SLIDE 10

Theoretical analysis of the exponent

𝛽 = 1.0 if words always co-occur in the same proportion. Example: suppose 𝑋 = {𝑥₀, 𝑥₁} and 𝑥₁ always occurs twice as often as 𝑥₀ ⟹ 𝜈₁ = 2𝜈₀, 𝜏₁ = 2𝜏₀ ⟹ 𝜏 ∝ 𝜈.

[Figure: segments of length Δ𝑢 with counts such as 𝑥₀: 3, 𝑥₁: 6 and 𝑥₀: 17, 𝑥₁: 34; on the log 𝜈–log 𝜏 plot the two points are offset by log 2 on both axes, so the gradient is 𝛽 = 1]
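The two-word example generalizes (making the step explicit; not on the slide): if each word 𝑥ᵢ always occurs in a fixed proportion cᵢ to a common reference word 𝑥₀, its mean and standard deviation both scale by cᵢ, and all points fall on a line of gradient 1.

```latex
% k_i = c_i * k_0 in every segment, for constants c_i
\[
  \nu_i = c_i \nu_0, \quad \tau_i = c_i \tau_0
  \;\Longrightarrow\;
  \tau_i = \frac{\tau_0}{\nu_0} \, \nu_i
  \;\Longrightarrow\;
  \tau \propto \nu^{1}, \quad \beta = 1 .
\]
```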


SLIDE 11

Taylor’s law for other data

  • Child-directed speech: Thomas (English, CHILDES), 450k words (8.2k different words)
  • Programming source code: Lisp, crawled and parsed, 3.7m words (160k different words)

[Figures: Taylor’s law plots for both corpora; labeled points include ‘up’, ‘things’, ‘hand’, ‘this’, ‘dear’, ‘truck’ and ‘xload’, ‘insert’, ‘platform’, ‘let’, ‘unless’, ‘and’]


SLIDE 12

Datasets

Kind                                       Languages        Number of texts  Average size  Example
Gutenberg & Aozora (long, single author)   14 (En, Fr, …)   1142             311,483       ‘Moby Dick’, ‘Les Misérables’
Newspapers                                 3 (En, Zh, Ja)   4                580,488,956   WSJ
Tagged Wiki                                1 (En + tag)     1                14,637,848    enwiki8
CHILDES                                    10 (En, Fr, …)   10               193,434       Thomas (English)
Music                                                       12               135,993       Matthäus (Bach)
Program codes                              4                4                34,161,018    C++, Lisp, Haskell, Python


SLIDE 13

Taylor exponents of various data kinds

  • Written texts (single author): mean 𝛽 = 0.58
  • Random texts: 𝛽 = 0.50
  • Other data: 𝛽 ≥ 0.63

None of the real texts showed the exponent 0.5.

[Figure: bar chart of 𝛽 per data kind; bars include 0.63, 0.68, 0.79, 0.79, 0.80 on an axis from 0.50 to 0.80]


SLIDE 14

Summary thus far

  • Taylor’s law holds in vast fields, including the natural and social sciences
  • Taylor’s law also holds in language and other language-related sequential data
  • The Taylor exponent shows the degree of co-occurrence among words
  • The Taylor exponent 𝛽 differs among text categories (Zipf’s law and Heaps’ law have no such property)

How can our results be useful? ⇒ Do machine-generated texts produce 𝛽 > 0.5?


SLIDE 15

Machine generated text by n-grams

[Figure: Taylor’s law plot for text generated from bigrams of ‘Moby Dick’]
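As a sketch of this experiment (our illustration; the paper’s generation setup may differ in detail), one can sample successors from the empirical bigram counts of a text and feed the output to the taylor_exponent() sketch from Slide 6:

```python
# Generate text by sampling successors from the empirical bigram counts.
import random
from collections import defaultdict

def generate_bigram_text(tokens, length, seed=0):
    rng = random.Random(seed)
    successors = defaultdict(list)
    for a, b in zip(tokens, tokens[1:]):
        successors[a].append(b)              # duplicates encode bigram frequencies
    out = [rng.choice(tokens)]
    for _ in range(length - 1):
        nexts = successors.get(out[-1])
        out.append(rng.choice(nexts) if nexts else rng.choice(tokens))
    return out

# fake = generate_bigram_text(tokens, 250_000)
# print(taylor_exponent(fake))   # bigram text loses long-range co-occurrence
```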


SLIDE 16

Machine generated texts by character-based LSTM language model

  • Learning: Shakespeare (2 million characters), naive setting
  • Generation: probabilistic generation of succeeding characters
  • Architecture: 128 preceding characters → stacked LSTM (3 LSTM layers, 256 nodes) → distribution of the following character

State-of-the-art models present different results (in another paper).
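A minimal model sketch matching the slide’s description (128-character context, 3 stacked LSTM layers of 256 nodes each, a softmax over the character vocabulary); the embedding layer, its size, and the optimizer are our assumptions, not the authors’ reported settings:

```python
# Character-level stacked LSTM language model: 128 preceding characters in,
# distribution over the following character out.
import tensorflow as tf

vocab_size = 64                     # illustrative; set from the corpus's character set
model = tf.keras.Sequential([
    tf.keras.Input(shape=(128,)),
    tf.keras.layers.Embedding(vocab_size, 64),
    tf.keras.layers.LSTM(256, return_sequences=True),
    tf.keras.layers.LSTM(256, return_sequences=True),
    tf.keras.layers.LSTM(256),
    tf.keras.layers.Dense(vocab_size, activation="softmax"),
])
model.compile(optimizer="adam", loss="sparse_categorical_crossentropy")
# Training pairs: sliding windows of 128 characters -> index of the next character.
```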


SLIDE 17

Texts generated by machine translation

  • ‘Les Misérables’ (original, French)
  • ‘Les Misérables’ translated into English by the Google translator

The fluctuation that derives from the context is provided by the source text.


SLIDE 18

Conclusion

  • Taylor’s law holds in vast fields, including the natural and social sciences
  • Taylor’s law also holds in language and other language-related sequential data
  • The Taylor exponent shows the degree of co-occurrence among words
  • The Taylor exponent 𝛽 differs among text categories (Zipf’s law and Heaps’ law have no such property)
  • The nature of 𝛽 > 0.5: context and long memory ← one limitation of CL
  • Taylor analysis could possibly serve to evaluate machine outputs
  • Knowing the mathematical characteristics of texts serves language engineering



SLIDE 19

Thank you
