Communicating generalizations (in computational terms)
Michael Henry Tessler Stanford University
A Generic Workshop (CSLI) May 20, 2017
What do generalizations in language mean?
Dogs bark.
Some dogs bark. Most dogs bark. All dogs bark. Dogs bark.
Metric: P(F | K) = prevalence
[[Some]] := {P(F | K) > 0}
[[Most]] := {P(F | K) > 0.5}
[[All]] := {P(F | K) = 1}
[[Generic]] := {P(F | K) > θ}
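The threshold semantics above can be sketched as truth conditions on prevalence. A minimal illustration (the function name and the default θ are my assumptions, not from the talk; the point of the talk is precisely that no fixed θ works for the generic):

```python
# Illustrative sketch of the threshold semantics on prevalence P(F | K).
# `theta` for the generic is a free parameter, deliberately unspecified.
def meaning(utterance, prevalence, theta=0.5):
    """True iff `utterance` holds of a kind whose members have the
    feature with probability `prevalence` = P(F | K)."""
    if utterance == "some":
        return prevalence > 0.0
    if utterance == "most":
        return prevalence > 0.5
    if utterance == "all":
        return prevalence == 1.0
    if utterance == "generic":
        return prevalence > theta
    raise ValueError(f"unknown utterance: {utterance}")
```

For example, `meaning("generic", 0.96)` comes out true for "Dogs bark" with θ = 0.5, but "Mosquitos carry malaria" (prevalence near 0.02) then comes out false, which is the puzzle developed below.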
Robins lay eggs. Robins are female.
prevalence: P(lays eggs | robin) ≈ P(is female | robin)
Mosquitos carry malaria.
Carlson (1977); Leslie (2008)
Endorsement task
30 generic sentences covering different “conceptual distinctions” (Prasada et al., 2013)
n = 100 from Amazon’s Mechanical Turk
Two-alternative forced choice
[Figure: human endorsement (Agree/Disagree) for the 30 generic sentences, ranging from “Leopards have wings.” and “Lions lay eggs.” near 0 to “Leopards have spots.”, “Lions have manes.”, and “Robins lay eggs.” near 1]
[[Generic]] := {P(F | K) > θ}
Prevalence elicitation task
n = 57 from Amazon’s Mechanical Turk
Rate the % of the animal with the property (e.g., % of robins that lay eggs)
Null hypothesis: raw frequency explains truth judgments
[Figure: human endorsement vs. % of category with property (prevalence); labeled items include “Leopards have spots.”, “Lions have wings.”, “Robins lay eggs.”, “Robins are female.”, “Sharks don’t eat people.”, “Mosquitos carry malaria.”]
Statistics (with a hard semantics) is insufficient
“A theory of generics should smoothly integrate with a more comprehensive semantic (and pragmatic) theory for a natural language.”
– Nickel, B., 2016, p. 8
“Last night, we had to wait a million years to get a table”
“One of my avowed aims is to see talking as a special case or variety of purposive, indeed rational, behavior …” – Grice (1975)
An assumption of cooperativity in language understanding
understands language pragmatically (Hawkins et al., 2015)
For a review, see Goodman & Frank (2016), Trends in Cognitive Sciences
Can this formal pragmatics model understand generalizations in language?
What do generalizations in language mean?
Dogs bark. ≈ P(bark | dog) > θ; call this probability h
P(ugen | h) ∝ 1 if h > θ, 0 otherwise
[[Generic]] := {P(F | K) > θ}
What should θ be?
cf. Sterken (2015)
PL(h, θ | ugen) ∝ P(h) · P(θ) · P(ugen | h)
P(h): world knowledge
P(ugen | h) ∝ 1 if h > θ, 0 otherwise (semantics)
Simple but underspecified Tessler & Goodman (arXiv, in revision)
Interpretation model: Given a generalization, what is h?
Listener hears “dogs bark”: PL(h, θ | ugen) ∝ P(h) · P(θ) · P(ugen | h)
[Figure: listener posterior over θ and p(F | K), with the prior over p(F | K) given by world knowledge]
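The interpretation model can be sketched with a grid approximation. The prior over h below is an arbitrary placeholder for measured world knowledge, and θ gets a uniform prior; only the update rule follows the equations above.

```python
import numpy as np

# Grid over prevalence h and threshold theta (shared support).
grid = np.linspace(0.005, 0.995, 100)

# Placeholder prior on h: a stand-in for elicited world knowledge.
prior_h = grid**0.2 * (1 - grid)**2
prior_h /= prior_h.sum()
prior_theta = np.full(len(grid), 1 / len(grid))  # uniform prior on theta

# P_L(h, theta | u_gen) ∝ P(h) · P(theta) · [h > theta]
joint = prior_h[:, None] * prior_theta[None, :] * (grid[:, None] > grid[None, :])
joint /= joint.sum()

posterior_h = joint.sum(axis=1)  # marginalize out theta
prior_mean = float((grid * prior_h).sum())
post_mean = float((grid * posterior_h).sum())
print(f"prior mean of h: {prior_mean:.3f}, posterior mean: {post_mean:.3f}")
```

Hearing the generic shifts belief about prevalence upward; how far it shifts depends entirely on the prior, which is where world knowledge enters.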
Endorsement model: Given an h, do you say the generalization (vs. not)?
Listener: PL(h, θ | ugen) ∝ P(h) · P(θ) · P(ugen | h)
Speaker: PS(ugen | h) ∝ P(u) · PL(h | ugen), marginalizing over θ
Utterance prior: P(u) = UniformDraw(ugen, silence)
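The endorsement model can be sketched on the same grid: the speaker chooses between the generic and silence in proportion to how accurately each utterance leads the literal listener back to the true h. The uniform prior on h and the softmax rationality α below are illustrative choices, not values from the talk.

```python
import numpy as np

grid = np.linspace(0.005, 0.995, 100)
prior_h = np.full(len(grid), 1 / len(grid))      # placeholder world knowledge
prior_theta = np.full(len(grid), 1 / len(grid))  # uniform prior on theta

def listener(utterance):
    """P_L(h | u), marginalizing over theta; silence is uninformative."""
    if utterance == "silence":
        return prior_h
    joint = prior_h[:, None] * prior_theta[None, :] * (grid[:, None] > grid[None, :])
    return joint.sum(axis=1) / joint.sum()

def endorsement(h_index, alpha=2.0):
    """P_S(u_gen | h): probability of producing the generic given prevalence grid[h_index]."""
    scores = np.array([listener(u)[h_index] for u in ("generic", "silence")]) ** alpha
    return scores[0] / scores.sum()

low, high = endorsement(10), endorsement(90)
print(f"endorsement at h near 0.1: {low:.2f}, near 0.9: {high:.2f}")
```

Endorsement rises with prevalence here because the prior is flat; in the full model, a prior concentrated near zero (e.g., for carrying malaria) is what makes even a low-prevalence generic informative and hence endorsable.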
Defining h
h = P(x ∈ F | x ∈ K): prevalence, frequency, propensity, subjective probability, …
For some recent hypotheses, see Icard et al. (2017)
Case studies of genericity

                   Categories (generics)   Events (habituals)    Causes (causals)
Example            Dogs bark               John smokes           Drinking moonshine makes you go blind
Category K         DOG                     JOHN                  DRINKING MOONSHINE
Property F         barks                   is smoking            caused person to go blind
Alternative Ks     Other animals           Other people          Other possible causes
Prior on p(F | K)  Measured                Measured              Manipulated
Target p(F | K)    Measured                Manipulated           Manipulated
PL(h, θ | ugen) ∝ P(h) · P(θ) · P(ugen | h)
PS(ugen | h) ∝ P(u) · PL(h | ugen)
Beliefs about probabilities
“What’s your favorite animal? What % lays eggs? What % is female? What % carries malaria?”
Case study: Categories (generics), e.g., Dogs bark (prior on p(F | K): measured; target p(F | K): measured)
Null hypothesis: raw frequency explains truth judgments
r²(30) = 0.59
[Figure: human endorsement vs. % of category with property]
Prior experiment
n = 60 from Amazon’s Mechanical Turk; 21 properties in total
generated by participants
[Figure: elicited prior densities of p(F | K) for six properties: have wings, are full-grown, are white, carry malaria, are female, lay eggs]
Null hypothesis (raw frequency explains truth judgments): r²(30) = 0.59
Null hypothesis 2 (frequency + “cue validity”, i.e., P(F|K) + P(K|F)): r²(30) = 0.79
[Figure: human endorsement vs. prevalence + cue validity model; labeled items include “Mosquitos carry malaria.”, “Lions have manes.”, “Mosquitos don’t carry malaria.”, “Robins lay eggs.”]
Pragmatics model: uncertain threshold + world knowledge
r²(30) = 0.98
[Figure: human endorsement vs. model prediction]
[Figure: prior densities over prevalence for “mosquitos carry malaria”, “sharks don’t attack swimmers”, “robins are female”, “robins lay eggs”]
Listener prior: P(h) = PL(h | silence)
Listener posterior: PL(h | generic)
a cognitive agent knows when to endorse generics
semantics of generics
meaning, in context
Exploring the prevalence prior
Priors on prevalence are structured. The “null component” could reflect accidental or transient causes; the “positive component” could reflect stable causes (cf. Gelman, 2004; but also Vasilyeva, N. [later today]).
Case study: Events (habituals), e.g., John smokes (prior on p(F | K): measured; target p(F | K): manipulated)
Materials: 31 actions from 5 categories
Food & drug: e.g., smokes cigarettes, eats peanut butter
Clothing: e.g., wears a watch
Work: e.g., sells things on eBay
Entertainment: e.g., watches professional football
Hobbies: e.g., runs, hikes
Experimental design: n = 150 (~50 per item); 36 trials / participant; 93 unique event::frequency pairs (per week, month, …)
Null model: raw frequency explains truth judgments
r²(93) = 0.33; for events with frequency < 1 / year, r²(50) = 0.07
[Figure: proportion human endorsement vs. log event frequency (from 3 / 5 years to 3 / week); labeled items include climbs mountains, hikes, runs, smokes cigarettes, goes to the movies]
Pragmatics model: uncertain threshold + world knowledge
r²(93) = 0.94
[Figure: proportion human endorsement vs. model prediction]
[Figure: prior densities over log frequency for “runs”, “hikes”, “climbs mountains”]
Listener prior: P(h) = PL(h | silence)
Listener posterior: PL(h | habitual)
Mary smokes.
Hypothesis A: the speaker communicates past frequency
Hypothesis B: the speaker communicates future predictions
PS(ugen | h) ∝ P(u) · PL(h | ugen), marginalizing over θ
Causal manipulation statement:
Experimental design
Experiment 2a: prediction elicitation (n = 120): “In the next WEEK, how many times do you think Mary will smoke cigarettes?”
Experiment 2b: endorsement task (n = 150): “Mary smokes cigarettes.”
Prediction elicitation / manipulation check
[Figure: predicted (log) frequency vs. past (log) frequency, by condition (baseline, enabling, preventative)]
Endorsements
[Figure: proportion human endorsement vs. model posterior predictive, by condition (baseline, enabling, preventative)]
PS(ugen | past frequency) vs. PS(ugen | predictive frequency)
Case study: Causes (causals), e.g., Drinking moonshine makes you go blind (prior on p(F | K): manipulated; target p(F | K): manipulated)
Cover story
[Figure: scaled prior probability over target probability, for Common Weak, Rare Weak, Common Deterministic, and Rare Deterministic conditions]
Endorsement task
[Figure: distribution of endorsements at observed frequencies of 20% and 70%, for Common/Rare × Weak/Deterministic conditions; error bars are 95% Bayesian credible intervals]
An underspecified, truth-functional semantics for modeling genericity
The prevalence prior shows effects on endorsements, as predicted by the model
Separates (a) world knowledge from (b) semantics, within a general framework for communication (Rational Speech Act)
h = P(x ∈ F | x ∈ K)
Underspecified, truth-functional semantics: PL(h, θ | ugen) ∝ P(h) · P(θ) · P(ugen | h), with P(ugen | h) ∝ 1 if h > θ, 0 otherwise
Always with respect to a comparison class (though see Tessler, Lopez-Brau, & Goodman, 2017 CogSci)
Soft semantics alternative: PL(h | ugen) ∝ P(h) · Psoft(ugen | h), with Psoft(ugen | h) ∝ h
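The soft-semantics variant is simple to sketch on a grid: the threshold variable disappears and the listener just reweights the prior by h. With a flat placeholder prior, the posterior mean works out to E[h²] / E[h], roughly 2/3.

```python
import numpy as np

grid = np.linspace(0.005, 0.995, 100)
prior_h = np.full(len(grid), 1 / len(grid))  # flat placeholder prior

# P_L(h | u_gen) ∝ P(h) · P_soft(u_gen | h), with P_soft(u_gen | h) ∝ h
posterior = prior_h * grid
posterior /= posterior.sum()

post_mean = float((grid * posterior).sum())
print(f"posterior mean under soft semantics: {post_mean:.3f}")
```

Compared with the hard-threshold listener, no θ prior is needed here; the strength of the upward shift is fixed by the prior alone.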
Thank you
Funding: National Science Foundation GRFP
Noah Goodman