Jon W. Carr
Centre for Language Evolution, School of Philosophy, Psychology and Language Sciences, University of Edinburgh
Informativeness: A review of work by Regier and colleagues (and a response)
What shapes language?
Learning and communication shape language: learning exerts pressures for simplicity and compressibility, while communication exerts pressures for expressivity and informativeness.
How do learning and communication shape the structure of semantic categories?
Two competing pressures: a pressure for simplicity and a pressure for informativeness.
Kinship terms are simple and informative
Kemp & Regier (2012)
[Plot: attested kinship systems on the simplicity–informativeness plane; axes: ⬅ Simple, ⬅ Informative]
Learning and communication in the CLE framework

[Plot: artificial languages from iterated-learning experiments placed on the simplicity–informativeness plane; axes: ⬅ Simple, ⬅ Informative]

Learning alone (Kirby, Cornish, & Smith, 2008): underspecified languages in which many meanings share a label (e.g. tuge ×9, tupim ×3, miniku ×3, tupin ×3, poi ×9)

Communication alone (Kirby, Tamariz, Cornish, & Smith, 2015): holistic languages with a distinct, unstructured label for each meaning (e.g. newhomo, kamone, gaku, hokako, kapa, gakho, wuwele, nepi, pihino, nemone, piga, kawake)

Learning and communication (Kirby, Tamariz, Cornish, & Smith, 2015): compositionally structured languages (e.g. gamenewawu, gamenewawa, gamenewuwu, gamene, mega, megawawa, megawuwu, wulagi, egewawu, egewawa, egewuwu, ege)
Summary

CLE framework:
- Pressure from learning → Compressibility: To what extent can the language be compressed? (Measure: MDL, gzip, entropy)
- Pressure from communication → Expressivity: How many meaning distinctions does the language allow? (Measure: number of words)

Regier framework:
- Pressure from learning → Simplicity: How many words does an individual need to remember? (Measure: number of words, number of rules)
- Pressure from communication → Informativeness: How effectively can a meaning be transmitted? (Measure: communicative cost)
Summary
Compressibility = bits required to represent the language
Informativeness = bits lost during communication
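As a toy illustration of the gzip measure mentioned above, we can compare the compressed size of an underspecified language (many repeated labels) with that of a holistic one (all labels distinct). The word lists are borrowed from the Kirby et al. slides earlier; the helper function is a sketch, not the measure used in the cited work.

```python
import gzip

def gzip_size(language):
    """Crude compressibility proxy: gzipped size of the label listing."""
    return len(gzip.compress(" ".join(language).encode()))

# Underspecified language (Kirby, Cornish, & Smith, 2008): many repeats.
underspecified = ["tuge"] * 9 + ["tupim"] * 3 + ["miniku"] * 3 + ["tupin"] * 3 + ["poi"] * 9

# Holistic language (Kirby, Tamariz, Cornish, & Smith, 2015): all distinct.
holistic = ["newhomo", "kamone", "gaku", "hokako", "kapa", "gakho",
            "wuwele", "nepi", "pihino", "nemone", "piga", "kawake"]

print(gzip_size(underspecified) < gzip_size(holistic))  # True: repeats compress well
```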
Communicative cost: High-level overview
Communicative cost: Low-level details
To compute the cost of a category partition, we start by considering an individual target meaning and computing how much error would be incurred in trying to reconstruct that target. Reconstruction error is defined as the Kullback–Leibler divergence between the speaker distribution s and the listener distribution l:

D_KL(s ‖ l) = Σ_i s(i) log2( s(i) / l(i) ) = log2( 1 / l(t) )

where the simplification follows because s places all of its probability on the target meaning t. Summing the divergences for all targets, weighted by their need probabilities, yields the communicative cost for the partition:

k = Σ_t p(t) D_KL(s ‖ l) = Σ_t p(t) log2( 1 / l(t) )
Communicative cost: Example of a discrete categorizer

Universe: U = {i1, i2, ..., i16}
Category partition: P = {C1, C2, C3, C4} = {{i1, i2, i3, i4}, {i5, i6, i7, i8}, {i9, i10, i11, i12}, {i13, i14, i15, i16}}
Speaker's lexicon S: maps each category to a distinct signal; listener's lexicon L: maps each signal back to its category
Need probabilities: p = [1/16, 1/16, ..., 1/16] (uniform)
Speaker distributions (one per meaning): point distributions on the target, e.g. s1 = [1, 0, 0, ..., 0], s2 = [0, 1, 0, ..., 0], ..., s16 = [0, ..., 0, 1]
Listener distributions (one per category): uniform over the category, e.g. lC1 = [1/4, 1/4, 1/4, 1/4, 0, ..., 0], lC2 = [0, 0, 0, 0, 1/4, 1/4, 1/4, 1/4, 0, ..., 0], and so on

k = Σ_t p(t) log2( 1 / l(t) ) = 16 × ( 1/16 × log2( 1 / (1/4) ) ) = log2 4 = 2 bits
Why 2 bits?
Ideal system: 4-bit signals (one signal for every meaning): 0000, 0001, 0010, 0011, 0100, 0101, 0110, 0111, 1000, 1001, 1010, 1011, 1100, 1101, 1110, 1111
Actual system: 2-bit signals (one signal per category): 00, 01, 10, 11
Loss of information on every communicative episode: 4 bits − 2 bits = 2 bits
(A pressure from learning prefers the more compressed system.)
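As a sanity check on the arithmetic above, the cost of the discrete four-category partition can be computed directly. This is a minimal sketch; the function and variable names are illustrative rather than taken from any published implementation.

```python
import math

def communicative_cost(partition, need_probs):
    """k = sum over targets t of p(t) * log2(1 / l(t)), where a discrete
    listener spreads probability uniformly over the signalled category."""
    cost = 0.0
    for category in partition:
        for target in category:
            l_t = 1.0 / len(category)  # listener's probability on the true target
            cost += need_probs[target] * math.log2(1.0 / l_t)
    return cost

universe = list(range(16))
partition = [universe[0:4], universe[4:8], universe[8:12], universe[12:16]]
p = {i: 1 / 16 for i in universe}  # uniform need probabilities

print(communicative_cost(partition, p))  # 2.0 bits, as derived above
```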
Communicative cost: Listener distributions

Humans aren't discrete categorizers; in human cognition we see two effects:
(a) within-category prototypicality
(b) across-category fuzziness
Instead, the listener distributions can be modelled as Gaussians:

lC(i) ∝ Σ_{j ∈ C} e^(−γ·d(i,j)²)

where γ allows you to model various types of categorizer (large γ: discrete categorizer; intermediate γ: fuzzy categorizer; γ = 0: non-categorizer)

[Plots: listener distributions over meanings 1–16 (categories 1–4, 5–8, 9–12, 13–16) for a discrete categorizer, a fuzzy categorizer, and a non-categorizer]
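The listener distribution above can be sketched numerically. The circular distance function and the particular γ values below are assumptions chosen for illustration on a 16-meaning universe.

```python
import math

N = 16  # number of meanings, arranged on a circle

def d(i, j):
    """Circular distance between meanings i and j (an assumed metric)."""
    return min(abs(i - j), N - abs(i - j))

def listener_distribution(category, gamma):
    """l_C(i) proportional to sum over j in C of exp(-gamma * d(i, j)^2), normalised."""
    weights = [sum(math.exp(-gamma * d(i, j) ** 2) for j in category) for i in range(N)]
    total = sum(weights)
    return [w / total for w in weights]

C1 = range(0, 4)
discrete = listener_distribution(C1, gamma=100)  # ~uniform over C1, ~0 elsewhere
fuzzy = listener_distribution(C1, gamma=0.05)    # peaked on C1, spread elsewhere
flat = listener_distribution(C1, gamma=0)        # uniform over all 16 meanings
```

With γ = 0 every weight is 1, so the distribution is uniform over the whole universe (a non-categorizer); as γ grows, probability concentrates on the category itself (a discrete categorizer in the limit).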
Communicative cost: Example of a fuzzy categorizer

Same universe, category partition, lexicons, need probabilities, and speaker distributions as before, but now the listener distributions are fuzzy:
lC1 = [.079, .082, .082, .079, .071, .064, .058, .053, .048, .045, .045, .048, .053, .058, .064, .071]
lC2 = [.053, .058, .064, .071, .079, .082, .082, .079, .071, .064, .058, .053, .048, .045, .045, .048]
lC3 = [.048, .045, .045, .048, .053, .058, .064, .071, .079, .082, .082, .079, .071, .064, .058, .053]
lC4 = [.071, .064, .058, .053, .048, .045, .045, .048, .053, .058, .064, .071, .079, .082, .082, .079]

k = Σ_t p(t) log2( 1 / l(t) ) = 3.636 bits
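Plugging the listener distributions quoted above back into the cost formula reproduces the reported figure, up to rounding (the slide's values are given to only three decimal places).

```python
import math

# Fuzzy listener distributions as quoted on the slide (rounded to 3 d.p.).
l_C = [
    [.079, .082, .082, .079, .071, .064, .058, .053, .048, .045, .045, .048, .053, .058, .064, .071],
    [.053, .058, .064, .071, .079, .082, .082, .079, .071, .064, .058, .053, .048, .045, .045, .048],
    [.048, .045, .045, .048, .053, .058, .064, .071, .079, .082, .082, .079, .071, .064, .058, .053],
    [.071, .064, .058, .053, .048, .045, .045, .048, .053, .058, .064, .071, .079, .082, .082, .079],
]
categories = [range(0, 4), range(4, 8), range(8, 12), range(12, 16)]
p = 1 / 16  # uniform need probability for each of the 16 meanings

# k = sum over targets t of p(t) * log2(1 / l(t)), where l is the listener
# distribution for the category containing t.
k = sum(p * math.log2(1 / l_C[c][t]) for c, members in enumerate(categories) for t in members)
print(round(k, 3))  # ~3.635 from these rounded values; the slide reports 3.636
```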
Communicative cost: Six predictions
Convexity: A system of convex categories (blue) is more informative than a system of nonconvex categories (red)
Discreteness: A system of discrete categories is more informative than a system of fuzzy categories
Compactness: A system of compact categories is more informative than a system of noncompact categories
Expressivity: A system of many categories is more informative than a system of few categories
Balanced categories: A system of equally sized categories is more informative than a system of unequally sized categories
Dimensionality: A system that uses many dimensions is less (?) informative than a system that uses few dimensions
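The convexity prediction above can be demonstrated with a small simulation: using a fuzzy listener of the form l_C(i) ∝ Σ_{j∈C} e^(−γ·d(i,j)²) on a circular space (the metric and the γ value are assumptions for illustration), a partition into contiguous categories incurs lower communicative cost than an interleaved, nonconvex one.

```python
import math

N = 16  # meanings arranged on a circle

def d(i, j):
    return min(abs(i - j), N - abs(i - j))  # circular distance (assumed metric)

def cost(partition, gamma=0.1):
    """Communicative cost with a fuzzy listener and uniform need probabilities."""
    total = 0.0
    for C in partition:
        weights = [sum(math.exp(-gamma * d(i, j) ** 2) for j in C) for i in range(N)]
        Z = sum(weights)
        for t in C:
            total += (1 / N) * math.log2(Z / weights[t])  # p(t) * log2(1 / l(t))
    return total

convex = [list(range(0, 4)), list(range(4, 8)), list(range(8, 12)), list(range(12, 16))]
nonconvex = [[i for i in range(N) if i % 4 == r] for r in range(4)]  # interleaved

print(cost(convex) < cost(nonconvex))  # True: convex categories are cheaper
```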
Communicative cost: Summary
When communicating, interlocutors want to align as closely as possible on the same meaning in the face of:
(a) the speaker's uncertainty about the true meaning
(b) the information lost when the listener receives only a general category label
Communicative cost tells us how 'good' a partition is when it is used for communication: a good partition results, on average, in low information loss (it has low communicative cost). This model makes various predictions about what makes a language informative.
Colour categories are informative for a given complexity
Regier, Kemp, & Kay (2015); reanalysed from Regier, Kay, & Khetarpal (2007)
Spatial terms are more informative than chance
Khetarpal, Neveu, Majid, Michael, & Regier (2013); data from Levinson et al. (2003)
Container names are more informative than chance
Xu, Regier, & Malt (2016); data from Malt et al. (1999)
Iterated learning and informativeness
Carstensen, Xu, Smith, & Regier (2015, p. 303): “[Our] prior work has also left an important question unaddressed. In a commentary on Kemp and Regier’s (2012) kinship study, Levinson (2012) pointed out that although [our] research explains cross-language semantic variation in communicative terms, it does not tell us ‘where our categories come from’ (p. 989); that is, it does not establish what process gives rise to the diverse attested systems of informative categories. Levinson suggested that a possible answer to that question may lie in a line of experimental work that explores human simulation of cultural transmission in the laboratory, and ‘shows how categories get honed through iterated learning across simulated generations’ (p. 989). We agree that prior work explaining cross-language semantic variation in terms of informative communication has not yet addressed this central question, and we address it here.”

Although their model of informativeness is framed in terms of communicative benefit, in this paragraph they appear to be open to the idea that there could be an explanation from learning.
Iterated learning and informativeness
If true, this doesn't sit well with our (post-2015?) framework, which says that:
(a) communication promotes informativeness/expressivity, and
(b) (iterated) learning promotes simplicity/compressibility.
However, they present two iterated learning studies in support of this idea.
Study 1: Iterated learning gives rise to informative colour categories
Carstensen, Xu, Smith, & Regier (2015); data from Xu, Dowman, & Griffiths (2013)
Study 2: Iterated learning gives rise to informative spatial terms
Carstensen, Xu, Smith, & Regier (2015)
Iterated learning promotes informativeness?
The paper sets out to establish what process gives rise to informative categories. Their results suggest that informative categories may arise cumulatively through iterated learning. The effect can't be driven by expressivity, since the number of categories is fixed.
Problem 1: What's the mechanism? Why should learning care about informativeness?
Problem 2: Both experiments test only iterated learning; there is no experiment testing the effect of communication.
Problem 3: Both experiments force participants to use a certain number of categories, so our prediction that learning should lead to simplicity can't be observed.
Solution? Since the languages can't simplify, the only effect a participant can have is to introduce a more sensible structuring of the space; over time, these effects add up to more informative systems.
Shepard circles

[Stimuli: Shepard circles varying along two dimensions: eight sizes (25–200 px, in 25 px steps) and eight orientations (147.0°–327.0°, i.e. 2.57–5.71 rad)]

Squares and stripes: Predictions
Angle-only and Size-only: easy to learn but low informativeness
Angle & Size: informative but hard to learn
Experimental design
- 20-minute online experiment run on CrowdFlower
- 40 participants per condition
- Paid $3 + bonuses for correct answers (potentially up to $4.92)
- Training phase in which participants learn an artificial language
- Test phase in which they produce a word for each meaning
Training
Test
Results: Angle-only
Results: Size-only
Results: Angle & Size
Result: Learnability advantage for the less informative systems
Comprehension test
Experiment 2 results
Angle-only Size-only Angle & Size
Simulating communication
Perfect producer ➠ all 40 comprehenders
All 40 producers ➠ perfect comprehender
Conclusions
Regier's lab has shown that real languages lie at the optimal frontier of informativeness and simplicity.
Meanwhile, we've been interested in identifying which pressures give rise to informativeness and simplicity, using artificial languages.
Both frameworks share many commonalities and may be amenable to a unifying information-theoretic model.
Their first work with iterated learning suggests that communication is not required for informative languages; learning alone may be enough.
However, our initial experiments suggest that informativeness is driven by communication.
Perhaps the result would be stronger with a genuine communicative task.
References
Carstensen, A., Xu, J., Smith, C. T., & Regier, T. (2015). Language evolution in the lab tends toward informative communication. In D. C. Noelle, R. Dale, A. S. Warlaumont, J. Yoshimi, T. Matlock, C. D. Jennings, & P. P. Maglio (Eds.), Proceedings of the 37th Annual Conference of the Cognitive Science Society (pp. 303–308). Austin, TX: Cognitive Science Society.

Kemp, C., & Regier, T. (2012). Kinship categories across languages reflect general communicative principles. Science, 336, 1049–1054.

Khetarpal, N., Neveu, G., Majid, A., Michael, L., & Regier, T. (2013). Spatial terms across languages support near-optimal communication: Evidence from Peruvian Amazonia, and computational analyses. In M. Knauff, M. Pauen, N. Sebanz, & I. Wachsmuth (Eds.), Proceedings of the 35th Annual Conference of the Cognitive Science Society (pp. 764–769). Austin, TX: Cognitive Science Society.

Kirby, S., Cornish, H., & Smith, K. (2008). Cumulative cultural evolution in the laboratory: An experimental approach to the origins of structure in human language. Proceedings of the National Academy of Sciences of the USA, 105, 10681–10686.

Kirby, S., Tamariz, M., Cornish, H., & Smith, K. (2015). Compression and communication in the cultural evolution of linguistic structure. Cognition, 141, 87–102.

Levinson, S., Meira, S., & the Language and Cognition group (2003). ‘Natural concepts’ in the spatial topological domain—adpositional meanings in crosslinguistic perspective: An exercise in semantic typology. Language, 79, 485–516.

Malt, B. C., Sloman, S. A., Gennari, S. P., Shi, M., & Wang, Y. (1999). Knowing versus naming: Similarity and the linguistic categorization of artifacts. Journal of Memory and Language, 40, 230–262.

Regier, T., Carstensen, A., & Kemp, C. (2016). Languages support efficient communication about the environment: Words for snow revisited. PLoS ONE, 11, e0151138.

Regier, T., Kay, P., & Khetarpal, N. (2007). Color naming reflects optimal partitions of color space. Proceedings of the National Academy of Sciences of the USA, 104, 1436–1441.

Regier, T., Kemp, C., & Kay, P. (2015). Word meanings across languages support efficient communication. In B. MacWhinney & W. O’Grady (Eds.), The handbook of language emergence (pp. 237–263). Hoboken, NJ: John Wiley & Sons.

Xu, J., Dowman, M., & Griffiths, T. L. (2013). Cultural transmission results in convergence towards colour term universals. Proceedings of the Royal Society B: Biological Sciences, 280, 1–8.

Xu, Y., Regier, T., & Malt, B. C. (2016). Historical semantic chaining and efficient communication: The case of container names. Cognitive Science, 40, 2081–2094.