SLIDE 1

Communicating generalizations (in computational terms)

Michael Henry Tessler Stanford University

A Generic Workshop (CSLI) May 20, 2017

SLIDE 2

What do generalizations in language mean?

SLIDE 3

Dogs bark.

SLIDE 4

Some dogs bark. Most dogs bark. All dogs bark. Dogs bark.

Metric: P(F | K) = prevalence

[[Some]]    := {P(F | K) > 0}
[[Most]]    := {P(F | K) > 0.5}
[[All]]     := {P(F | K) = 1}
[[Generic]] := {P(F | K) > θ}
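Stated as code, these four entries are just predicates over a single prevalence value. A minimal sketch (mine, not from the talk; the function names are illustrative):

```python
# Threshold semantics over prevalence P(F | K); names are illustrative.

def some(prevalence):               # [[Some]]    := {P(F | K) > 0}
    return prevalence > 0.0

def most(prevalence):               # [[Most]]    := {P(F | K) > 0.5}
    return prevalence > 0.5

def all_(prevalence):               # [[All]]     := {P(F | K) = 1}
    return prevalence == 1.0        # trailing underscore avoids Python's builtin

def generic(prevalence, theta):     # [[Generic]] := {P(F | K) > θ}
    return prevalence > theta       # θ left as a free, uncertain parameter
```

For example, generic(0.8, theta=0.5) is true while most(0.4) is false; the open question, taken up below, is what θ should be.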

SLIDE 5

Robins lay eggs. vs. Robins are female. (Carlson, 1977; Leslie, 2008)
prevalence: P(lays eggs | robin) ≈ P(is female | robin)

SLIDE 6

Robins lay eggs. vs. Robins are female. Mosquitos carry malaria. (Carlson, 1977; Leslie, 2008)

SLIDE 7

Endorsement task

30 generic sentences covering different “conceptual distinctions” (Prasada et al., 2013)
n = 100 from Amazon’s Mechanical Turk
Two-alternative forced choice

SLIDE 8

n = 100 from MTurk

SLIDE 9

[Bar chart: human judgment (Agree vs. Disagree, 0 to 1) for items such as “Leopards have wings”, “Kangaroos have pouches”, “Mosquitos carry malaria”, “Ticks carry Lyme disease”.]

SLIDE 10

[Bar chart: human judgment (Agree vs. Disagree, 0 to 1) across all items: “Leopards have wings”, “Lions lay eggs”, “Peacocks don’t have beautiful feathers”, “Tigers have pouches”, “Sharks have manes”, “Kangaroos have pouches”, “Mosquitos carry malaria”, “Ticks carry Lyme disease”, “Cardinals are red”, “Peacocks have beautiful feathers”, “Mosquitos don’t carry malaria”, “Sharks lay eggs”, “Leopards are juvenile”, “Sharks don’t attack swimmers”, “Tigers don’t eat people”, “Sharks are white”, “Mosquitos attack swimmers”, “Robins are female”, “Lions are male”, “Tigers eat people”, “Swans are full-grown”, “Sharks attack swimmers”, “Swans are white”, “Leopards have spots”, “Lions have manes”, “Robins lay eggs”.]

SLIDE 11

[[Generic]] := {P(F | K) > θ}

SLIDE 12

Prevalence elicitation task

n = 57 from Amazon’s Mechanical Turk
Rate the % of the animal with the property (e.g., % of robins that lay eggs)

SLIDE 13

Null hypothesis: raw frequency explains truth judgments
r²(30) = 0.59

[Scatter plot: prevalence (% of category with property, 0 to 1) vs. human endorsement (0 to 1); labeled items include “Leopards have spots”, “Lions have wings”, “Robins lay eggs”, “Robins are female”, “Sharks don’t eat people”, “Mosquitos carry malaria”.]

SLIDE 14

Statistics (with a hard semantics) is insufficient

[Bar chart of human judgments for all items, repeated from Slide 10.]

SLIDE 15

“A theory of generics should smoothly integrate with a more comprehensive semantic theory for a natural language.”

– Nickel, B. (2016), p. 8

SLIDE 16

“A theory of generics should smoothly integrate with a more comprehensive semantic (and pragmatic) theory for a natural language.”

SLIDE 17

“Last night, we had to wait a million years to get a table”

SLIDE 18

“One of my avowed aims is to see talking as a special case or variety of purposive, indeed rational, behavior …” (Grice, 1975)

An assumption of cooperativity in language understanding

SLIDE 19

Rational Speech Act

  • A Bayesian cognitive model that understands language pragmatically
  • Many rich phenomena formalized:
      • Hyperbole (Kao et al., 2014)
      • Indirect answers to questions (Hawkins et al., 2015)
      • Politeness (*Yoon, *Tessler, et al., 2016)

For a review, see Goodman & Frank (2016), Trends in Cognitive Sciences.

SLIDE 20

Can this formal pragmatics model understand generalizations in language?

SLIDE 21

What do generalizations in language mean?

Dogs bark. ≈ P(bark | dog) > θ

SLIDE 22

Dogs bark. ≈ P(bark | dog) > θ

Call this probability, P(bark | dog), h.

P(ugen | h) ∝ {1 if h > θ; 0 otherwise}

SLIDE 23

Some dogs bark. Most dogs bark. All dogs bark. Dogs bark.

Metric: P(F | K) = prevalence

[[Some]]    := {P(F | K) > 0}
[[Most]]    := {P(F | K) > 0.5}
[[All]]     := {P(F | K) = 1}
[[Generic]] := {P(F | K) > θ}

SLIDE 24

What should θ be?

P(θ) = Uniform(0, 1)

cf. Sterken (2015)

SLIDE 25

P(θ) = Uniform(0, 1)
P(h): world knowledge

Semantics: P(ugen | h) ∝ {1 if h > θ; 0 otherwise}

PL(h, θ | ugen) ∝ P(h) · P(θ) · P(ugen | h)

Simple but underspecified. Tessler & Goodman (arXiv, in revision)
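For concreteness, here is a minimal grid-enumeration sketch of this listener in Python (my own illustration, not the talk's implementation): the 100-point grid and the Beta stand-in for world knowledge P(h) are assumptions; in the experiments, P(h) is measured empirically.

```python
# Grid approximation of PL(h, theta | u_gen) ∝ P(h) · P(theta) · P(u_gen | h).
import numpy as np
from scipy import stats

grid = np.linspace(0.005, 0.995, 100)             # shared support for h and theta
prior_h = stats.beta(1, 5).pdf(grid)              # stand-in world knowledge P(h)
prior_h /= prior_h.sum()
prior_theta = np.full(len(grid), 1 / len(grid))   # P(theta) = Uniform(0, 1)

# semantics: P(u_gen | h) = 1 if h > theta, else 0 (rows: h, columns: theta)
meaning = (grid[:, None] > grid[None, :]).astype(float)

joint = prior_h[:, None] * prior_theta[None, :] * meaning
joint /= joint.sum()                              # PL(h, theta | u_gen)
posterior_h = joint.sum(axis=1)                   # listener's belief about h
```

Relative to prior_h, posterior_h shifts mass toward higher prevalence, with the uncertain θ integrated out rather than fixed.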

SLIDE 26

Interpretation model: given a generalization (“dogs bark”), what is h?

Listener: PL(h, θ | ugen) ∝ P(h) · P(θ) · P(ugen | h)

[Diagram: the listener hears “dogs bark” and jointly infers θ and p(F | K), drawing on world knowledge.]

SLIDE 27

Listener: PL(h, θ | ugen) ∝ P(h) · P(θ) · P(ugen | h)

Speaker: PS(ugen | h) ∝ P(u) · ∫θ PL(h, θ | ugen) dθ

SLIDE 28

Listener: PL(h, θ | ugen) ∝ P(h) · P(θ) · P(ugen | h)

Speaker: PS(ugen | h) ∝ P(u) · ∫θ PL(h, θ | ugen) dθ

Utterance prior: P(u) = UniformDraw({ugen, silence})
slide-29
SLIDE 29

Listener: PL(h, θ | ugen) ∝ P(h) · P(θ) · P(ugen | h)

Speaker: PS(ugen | h) ∝ P(u) · ∫θ PL(h, θ | ugen) dθ

Endorsement model: given an h, do you say the generalization (“dogs bark”) or stay silent?
SLIDE 30

Defining h

h = P(x ∈ F | x ∈ K): prevalence, frequency, propensity, subjective probability, …
For some recent hypotheses, see Icard et al. (2017).

SLIDE 31

Overview: case studies of genericity

                    Categories (generics)   Events (habituals)   Causes (causals)
Example             Dogs bark               John smokes          Drinking moonshine makes you go blind
Category K          DOG                     JOHN                 DRINKING MOONSHINE
Property F          barks                   is smoking           caused person to go blind
Alternative Ks      Other animals           Other people         Other possible causes
Prior on p(F | K)   Measured                Measured             Manipulated
Target p(F | K)     Measured                Manipulated          Manipulated

SLIDE 32

Overview: case studies of genericity

                    Categories (generics)   Events (habituals)   Causes (causals)
Example             Dogs bark               John smokes          Drinking moonshine makes you go blind
Category K          DOG                     JOHN                 DRINKING MOONSHINE
Property F          barks                   is smoking           caused person to go blind
Alternative Ks      Other animals           Other people         Other possible causes
Prior on p(F | K)   Measured                Measured             Manipulated
Target p(F | K)     Measured                Manipulated          Manipulated

SLIDE 33

Listener: PL(h, θ | ugen) ∝ P(h) · P(θ) · P(ugen | h)

Speaker: PS(ugen | h) ∝ P(u) · ∫θ PL(h, θ | ugen) dθ

SLIDE 34

Beliefs about probabilities

What’s your favorite animal? What % lays eggs? What % is female?

SLIDE 35

Beliefs about probabilities

[Elicited prior distributions: % that lays eggs; % that carries malaria.]

SLIDE 36

Overview: case studies of genericity

                    Categories (generics)   Events (habituals)   Causes (causals)
Example             Dogs bark               John smokes          Drinking moonshine makes you go blind
Category K          DOG                     JOHN                 DRINKING MOONSHINE
Property F          barks                   is smoking           caused person to go blind
Alternative Ks      Other animals           Other people         Other possible causes
Prior on p(F | K)   Measured                Measured             Manipulated
Target p(F | K)     Measured                Manipulated          Manipulated

SLIDE 37

Null hypothesis: raw frequency explains truth judgments
r²(30) = 0.59

[Scatter plot repeated from Slide 13: prevalence (% of category with property) vs. human endorsement.]

SLIDE 38

Experiment 1a: prevalence prior elicitation

n = 60; 21 properties in total
  1. Generate animal names
  2. Rate the %, for each property

SLIDE 39

Prior experiment

n = 60 from Amazon’s Mechanical Turk; 21 properties in total; animal names generated by participants

SLIDE 40

Results

[Prevalence prior densities, p(F | K) on 0 to 1, for six properties: have wings, are full-grown, are white, carry malaria, are female, lay eggs.]

SLIDE 41

Null hypothesis: raw frequency explains truth judgments. r²(30) = 0.59
Null hypothesis 2: frequency + “cue validity” [i.e., P(F | K) + P(K | F)]. r²(30) = 0.79

[Scatter plots: prevalence model and prevalence + cue validity model vs. human endorsement; labeled items include “Mosquitos carry malaria”, “Lions have manes”, “Mosquitos don’t carry malaria”, “Robins lay eggs”.]

SLIDE 42

Null hypothesis: raw frequency. r²(30) = 0.59
Null hypothesis 2: frequency + “cue validity” [i.e., P(F | K) + P(K | F)]. r²(30) = 0.79
Pragmatics model: uncertain threshold + world knowledge. r²(30) = 0.98

[Scatter plots: each model’s prediction vs. human endorsement.]

SLIDE 43
Listener prior: P(h) = PL(h | silence). Listener posterior: PL(h | generic).

[Density plots of listener prior and posterior over prevalence for “carry malaria” (mosquitos), “don’t attack swimmers” (sharks), “are female” (robins), “lay eggs” (robins), alongside scatter plots of prevalence and model prediction vs. human endorsement.]

SLIDE 44
  • A model of a simple, underspecified semantics, interpreted by a cognitive agent, knows when to endorse generics
  • Probability (e.g., prevalence) is sufficient to formalize the semantics of generics
  • Prior beliefs and pragmatic principles combine to resolve meaning, in context

[Scatter plot: model prediction vs. human endorsement.]

SLIDE 45

Exploring the prevalence prior

SLIDE 46

Priors on prevalence are structured.

[Graphical model: a mixture weight φ selects between components D0 and D1, which generate h.]

The “null component” (D0) could reflect accidental or transient causes; the “positive component” (D1) could reflect stable causes (cf. Gelman, 2004; but also Vasilyeva, N. [later today]).
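One way to read the diagram is as a two-component mixture over prevalence. A sketch, with an illustrative mixture weight φ and Beta shapes (not fitted values):

```python
# Structured prevalence prior: with probability phi the property is stably
# present (D1, "positive component"); otherwise it is essentially absent
# (D0, "null component", mass near 0). Parameters here are illustrative.
import numpy as np

rng = np.random.default_rng(0)

def sample_prevalence(phi=0.3, n=10_000):
    """Draw prevalence h for a random category under the mixture prior."""
    positive = rng.random(n) < phi
    h = np.where(positive,
                 rng.beta(2, 2, n),       # D1: stable causes
                 rng.beta(0.5, 50, n))    # D0: accidental/transient causes
    return h

h_samples = sample_prevalence()           # bimodal: spike near 0 plus broad bump
```

A histogram of `h_samples` shows the characteristic shape of the elicited priors: a spike near zero plus a broad positive component.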

SLIDE 47

Overview: case studies of genericity

                    Categories (generics)   Events (habituals)   Causes (causals)
Example             Dogs bark               John smokes          Drinking moonshine makes you go blind
Category K          DOG                     JOHN                 DRINKING MOONSHINE
Property F          barks                   is smoking           caused person to go blind
Alternative Ks      Other animals           Other people         Other possible causes
Prior on p(F | K)   Measured                Measured             Manipulated
Target p(F | K)     Measured                Manipulated          Manipulated

SLIDE 48

Case study 2: endorsing habituals

Materials: 31 actions from 5 categories
  • Food & drug: e.g., smokes cigarettes, eats peanut butter
  • Clothing: e.g., wears a watch
  • Work: e.g., sells things on eBay
  • Entertainment: e.g., watches professional football
  • Hobbies: e.g., runs, hikes

SLIDE 49

Experimental design

n = 150 (~50 per item); 36 trials per participant; 93 unique event::frequency pairs (per week, per month, …)

SLIDE 50

Null model: raw frequency explains truth judgments

[Scatter plot: frequency of event (log scale) vs. proportion human endorsement; frequency levels include 3/week, 3/month, 3/year, 3/5 years.]

SLIDE 51

Null model: raw frequency explains truth judgments
r²(93) = 0.33 overall; r²(50) = 0.07 for events with frequency < 1/year

[Scatter plot: log frequency of event (1 to 7) vs. proportion human endorsement; items include climbs mountains, hikes, runs, smokes cigarettes, goes to the movies; frequency levels 3/week, 3/month, 3/year, 3/5 years.]

SLIDE 52

Null model: raw frequency explains truth judgments. r²(93) = 0.33
Pragmatics model: uncertain threshold + world knowledge. r²(93) = 0.94

[Left: frequency of event (log scale, 1 to 7) vs. proportion human endorsement. Right: model prediction vs. proportion human endorsement. Frequency levels from 3/week to 3/five years.]

SLIDE 53
Listener prior: P(h) = PL(h | silence). Listener posterior: PL(h | habitual).

[Density plots over log frequency (2 to 8) for “runs”, “hikes”, “climbs mountains”, alongside a scatter plot of model prediction vs. proportion human endorsement.]

SLIDE 54

Habituals follow-up: what is h?

Mary smokes.
Hypothesis A: the speaker communicates past frequency.
Hypothesis B: the speaker communicates future predictions.

PS(ugen | h) ∝ P(u) · ∫θ PL(h, θ | ugen) dθ

SLIDE 55

Experimental design

“Mary smokes cigarettes.” Causal manipulation statement:
  1. Preventative, e.g., “Yesterday, Mary quit smoking.”
  2. Enabling, e.g., “Yesterday, Mary bought a pack of cigarettes.”
  3. No additional information (same as before)

Experiment 2a: prediction elicitation (n = 120): “In the next WEEK, how many times do you think Mary will smoke cigarettes?”
Experiment 2b: endorsement task (n = 150): “Mary smokes cigarettes.”
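To make the contrast between the hypotheses concrete, a small self-contained sketch (illustrative numbers only, not the experiment's data): the same endorsement machinery is fed either Mary's past frequency (Hypothesis A) or her predicted future frequency after the manipulation (Hypothesis B); only under Hypothesis B does "Mary quit smoking" suppress endorsement.

```python
# Endorsement of a habitual given past vs. predicted frequency.
import numpy as np
from scipy import stats

grid = np.linspace(0.005, 0.995, 100)           # normalized (log-)frequency scale
prior_h = stats.beta(1, 3).pdf(grid)            # stand-in prior over frequencies
prior_h /= prior_h.sum()

hard = (grid[:, None] > grid[None, :]).astype(float)   # threshold semantics
L_habitual = prior_h[:, None] * hard
L_habitual /= L_habitual.sum()
L_habitual = L_habitual.sum(axis=1)             # PL(h | habitual), theta out
endorse = L_habitual / (L_habitual + prior_h)   # PL(h | silence) = P(h)

def endorse_at(h):
    """Endorsement probability at a given frequency value h."""
    return endorse[np.argmin(np.abs(grid - h))]

past, predicted = 0.7, 0.05                     # before vs. after "Mary quit smoking"
print(endorse_at(past))        # Hypothesis A: speaker uses past frequency (high)
print(endorse_at(predicted))   # Hypothesis B: speaker uses predicted frequency (low)
```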

SLIDE 56

Prediction elicitation / manipulation check

[Plot: past (log) frequency vs. predicted (log) frequency, by condition (baseline, enabling, preventative).]

SLIDE 57

Endorsements

[Scatter plots: model posterior predictive vs. proportion human endorsement, by condition (baseline, enabling, preventative), for a speaker given past frequency, PS(ugen | past frequency), vs. a speaker given predictive frequency, PS(ugen | predictive frequency).]

SLIDE 58

Overview: case studies of genericity

                    Categories (generics)   Events (habituals)   Causes (causals)
Example             Dogs bark               John smokes          Drinking moonshine makes you go blind
Category K          DOG                     JOHN                 DRINKING MOONSHINE
Property F          barks                   is smoking           caused person to go blind
Alternative Ks      Other animals           Other people         Other possible causes
Prior on p(F | K)   Measured                Measured             Manipulated
Target p(F | K)     Measured                Manipulated          Manipulated

SLIDE 59
Case study 3: generalizations about causes

  • Semi-familiar domains:
      • herbs making animals sleepy
      • fertilizers making plants grow tall
  • Manipulate distributions
  • Manipulate frequencies
  • 1-trial task (about 2 minutes)

SLIDE 60

Cover story

SLIDE 61

Cover story

SLIDE 62

Cover story

SLIDE 63

Cover story

SLIDE 64

Cover story

SLIDE 65
SLIDE 66

Experimental manipulations

[Prior distributions over the target probability (0 to 1) for four conditions: Common Weak, Rare Weak, Common Deterministic, Rare Deterministic.]

SLIDE 67

Endorsement task

SLIDE 68

Experimental manipulations

[Prior distributions over the target probability for four conditions: Common Weak, Rare Weak, Common Deterministic, Rare Deterministic; observed frequencies of 20% or 70%.]

8 unique conditions; n = 400 (~50 per condition)
SLIDE 69

[Results: proportion endorsement by frequency (20% vs. 70%) and distribution (Common Deterministic, Rare Deterministic, Common Weak, Rare Weak); error bars are 95% Bayesian credible intervals. Prior distributions shown for reference.]

SLIDE 70

Overview: case studies of genericity

                    Categories (generics)   Events (habituals)   Causes (causals)
Example             Dogs bark               John smokes          Drinking moonshine makes you go blind
Category K          DOG                     JOHN                 DRINKING MOONSHINE
Property F          barks                   is smoking           caused person to go blind
Alternative Ks      Other animals           Other people         Other possible causes
Prior on p(F | K)   Measured                Measured             Manipulated
Target p(F | K)     Measured                Manipulated          Manipulated

SLIDE 71

Summary

  • Introduced an underspecified threshold semantics for modeling genericity
  • Measured and manipulated the target probability and the prevalence prior; both affect endorsements, as predicted by the model
  • Formalized in a Bayesian model that cleanly separates (a) world knowledge from (b) semantics, within a general framework for communication (Rational Speech Act)

SLIDE 72

New directions

SLIDE 73

h = P(x ∈ F | x ∈ K), always with respect to a comparison class

P(θ) = Uniform(0, 1)

Underspecified, truth-functional semantics:
P(ugen | h) ∝ {1 if h > θ; 0 otherwise}
PL(h, θ | ugen) ∝ P(h) · P(θ) · P(ugen | h)

SLIDE 74

New directions

  • Comparison class: not well understood (though see Tessler, Lopez-Brau, & Goodman, 2017, CogSci)

SLIDE 75

h = P(x ∈ F | x ∈ K)

P(θ) = Uniform(0, 1)

Underspecified, truth-functional semantics:
P(ugen | h) ∝ {1 if h > θ; 0 otherwise}
PL(h, θ | ugen) ∝ P(h) · P(θ) · P(ugen | h)

Soft semantics:
Psoft(ugen | h) ∝ h
PL(h | ugen) ∝ P(h) · Psoft(ugen | h)
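The soft variant is even simpler to compute, since θ drops out entirely. A short sketch (the grid and Beta prior are my stand-ins for the measured world knowledge):

```python
# Soft-semantics listener: PL(h | u_gen) ∝ P(h) · P_soft(u_gen | h), with
# P_soft(u_gen | h) ∝ h, so no threshold variable theta is needed.
import numpy as np
from scipy import stats

grid = np.linspace(0.005, 0.995, 100)
prior_h = stats.beta(1, 5).pdf(grid)
prior_h /= prior_h.sum()                  # world knowledge P(h)

posterior_soft = prior_h * grid           # P(h) · P_soft(u_gen | h)
posterior_soft /= posterior_soft.sum()    # PL(h | u_gen), theta-free
```

Compared to the hard-threshold listener, this replaces a two-dimensional enumeration with a single reweighting of the prior by h.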

SLIDE 76

New directions

  • Comparison class: not well understood (though see Tessler, Lopez-Brau, & Goodman, 2017, CogSci)
  • Acquisition: is learning a soft semantics easier?
SLIDE 77

Thank you

Noah Goodman
Funding: National Science Foundation GRFP

SLIDE 78

Summary

  • Introduced an underspecified threshold semantics for modeling genericity
  • Measured and manipulated the target probability and the prevalence prior; both affect endorsements, as predicted by the model
  • Formalized in a Bayesian model that cleanly separates (a) world knowledge from (b) semantics, within a general framework for communication (Rational Speech Act)