SLIDE 1

Polysemy and frequency in thematic fit

Verb polysemy and frequency effects in thematic fit modeling

Clayton Greenberg, Vera Demberg, and Asad Sayeed
Saarland University / M2CI Cluster of Excellence
June 4, 2015

SLIDE 2

McRae et al. (1998) thematic fit

1. The cop arrested…
2. The crook arrested…

SLIDE 3

McRae et al. (1998) thematic fit

1. The cop arrested the crook.
2. The crook arrested by the cop confessed.

SLIDE 4

McRae et al. (1998) procedure

Ø On a scale from 1 (very uncommon) to 7 (very common), how common is it for a

§ snake § nurse § monster § baby § cat

to frighten someone/something? Ø How common is it for a

§ snake § nurse § monster § baby § cat

to be frightened by someone/something?

SLIDE 5

Thematic fit datasets

Judgements from Padó (2007)

SLIDE 6

Challenges to judgement well-formedness

Alice played

croquet soccer piano cheese in the garden

SLIDE 7

Role-filler frequency

How common is it for croquet/soccer to be played?

[Figure: Google ngram (Michel et al., 2010) comparison of “croquet” and “soccer”]

SLIDE 8

Polysemy

How common is it for soccer/the piano to be played?

[Figure, right: polysemy versus frequency of the most frequent verbs in COCA; axes: Frequency (log scale) and WordNet.SynSets. Corpus obtained from Davies (2008).]

SLIDE 9

Sense frequency

WordNet (Fellbaum, 1998) orders SynSets based on their frequencies.

- play_1: participate in games or sport. "We played hockey all afternoon"; "play cards"; "Pele played for the Brazilian teams in many important matches"
- play_7: perform music on (a musical instrument). "He plays the flute"; "Can you play on this old recorder?"

SLIDE 10

Research question

How do

1. role-filler frequency
2. polysemy
3. sense frequency

affect thematic fit judgements?

SLIDE 11

Stimuli selection

McRae et al. (1998) Ø Many purposes Ø Verbs have “well-defined” roles Ø Role-fillers selected to fit their roles well Ø Animate role-fillers preferred Ø 146 verbs Ø 1,444 (F,R,V) triples Padó (2007) Ø One purpose Ø Verbs are most frequent in Penn Treebank & FrameNet Ø Role-fillers selected to have a wide range of fit ratings Ø Fully mixed animacy Ø 18 verbs Ø 414 (F,R,V) triples

SLIDE 12

New formulation of the question

How common is it for croquet to be played?

[Figure: Google ngram (Michel et al., 2010) comparison of “croquet” and “soccer”]

SLIDE 13

New formulation of the question

Agreement scale: croquet is something that is played.

[Figure: Google ngram (Michel et al., 2010) comparison of “croquet” and “soccer”]

SLIDE 14

Verb selection

Ø Start with 500,000 most common word forms in COCA. Ø Filter for verbs. Ø Lemmatize using the WordNet lemmatizer in NLTK (Bird et al., 2009). Ø Filter for only those that retrieve exactly one SynSet. Ø Sort by frequency. Ø Choose the first 48 that fit the paradigm (transitive, etc…). Ø For each MONOSEMOUS verb, find a POLYSEMOUS verb (at least 2 salient senses, ~7 SynSets) with similar unigram frequency.

SLIDE 15

Stimuli examples

Filler type | Frequency | whip (1686, 6 SynSets) | punish (2908, 1 SynSet)

To find a good patient-filler, query COCA for: VERB [at*] [nn*]

SLIDE 16

Stimuli examples

Filler type | Frequency | whip (1686, 6 SynSets) | punish (2908, 1 SynSet)
Good        |           | horse (32384)          | outlaw (1487)

Find a much higher or lower (~10x) frequency synonym.
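As a quick check of what "~10x" means in practice, the frequency contrast between a filler and its synonym can be computed directly. The sketch below uses the COCA counts shown on these slides; the function name and the PAIRS table layout are assumptions made for illustration.

```python
from typing import Dict, Tuple


def frequency_ratio(freq_a: int, freq_b: int) -> float:
    """Ratio of the larger frequency to the smaller one (always >= 1)."""
    high, low = max(freq_a, freq_b), min(freq_a, freq_b)
    return high / low


# (high-frequency word, low-frequency word) -> (COCA counts from the slides)
PAIRS: Dict[Tuple[str, str], Tuple[int, int]] = {
    ("horse", "stallion"): (32384, 818),
    ("criminal", "outlaw"): (9271, 1487),
}

if __name__ == "__main__":
    for (high_word, low_word), (f_high, f_low) in PAIRS.items():
        print(f"{high_word} / {low_word}: {frequency_ratio(f_high, f_low):.1f}x")
```

The two example pairs fall on either side of the tenfold target, which suggests the figure is a rough guideline rather than a hard threshold.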

SLIDE 17

Stimuli examples

Filler type | Frequency | whip (1686, 6 SynSets) | punish (2908, 1 SynSet)
Good        | high      | horse (32384)          | criminal (9271)
Good        | low       | stallion (818)         | outlaw (1487)

For POLYSEMOUS verbs, repeat for second sense.

SLIDE 18

Stimuli examples

Filler type | Frequency | whip (1686, 6 SynSets) | punish (2908, 1 SynSet)
Sense1      | high      | horse (32384)          | criminal (9271)
Sense1      | low       | stallion (818)         | outlaw (1487)
Sense2      | high      | cream (19727)          |
Sense2      | low       | frosting (905)         |

Randomly shuffle good patient-fillers to assign poor ones.
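One way to carry out such a shuffle is to permute the good fillers across verbs and reject any permutation that hands a verb its own filler back. The sketch below is an assumption about how this could be done, not the authors' procedure; the verb keys verb_a and verb_b are hypothetical placeholders, since the slides do not say which verbs party and baby were good fillers for. Whether a shuffled filler is still too good a fit needs a human check, which this sketch does not attempt.

```python
import random
from typing import Dict, Optional

# Good fillers keyed by verb. whip/horse and punish/criminal follow the slides;
# verb_a and verb_b are hypothetical stand-ins for unnamed verbs.
GOOD: Dict[str, str] = {
    "whip": "horse",
    "punish": "criminal",
    "verb_a": "party",
    "verb_b": "baby",
}


def assign_poor_fillers(
    good: Dict[str, str],
    seed: Optional[int] = None,
    max_tries: int = 1000,
) -> Dict[str, str]:
    """Shuffle good fillers across verbs so that no verb keeps its own filler."""
    rng = random.Random(seed)
    verbs = list(good)
    fillers = [good[v] for v in verbs]
    for _ in range(max_tries):
        shuffled = fillers[:]
        rng.shuffle(shuffled)
        # Accept only permutations with no fixed point.
        if all(new != old for new, old in zip(shuffled, fillers)):
            return dict(zip(verbs, shuffled))
    raise RuntimeError("no valid shuffle found")


if __name__ == "__main__":
    print(assign_poor_fillers(GOOD, seed=0))
```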

SLIDE 19

Stimuli examples

Filler type | Frequency | whip (1686, 6 SynSets) | punish (2908, 1 SynSet)
Sense1      | high      | horse (32384)          | criminal (9271)
Sense1      | low       | stallion (818)         | outlaw (1487)
Sense2      | high      | cream (19727)          |
Sense2      | low       | frosting (905)         |
Bad         | high      | party (118292)         | criminal (9271)
Bad         | low       | gathering (7025)       | outlaw (1487)

Reshuffle all of the ones that are too good.

SLIDE 20

Stimuli examples

Filler type | Frequency | whip (1686, 6 SynSets) | punish (2908, 1 SynSet)
Sense1      | high      | horse (32384)          | criminal (9271)
Sense1      | low       | stallion (818)         | outlaw (1487)
Sense2      | high      | cream (19727)          |
Sense2      | low       | frosting (905)         |
Bad         | high      | party (118292)         | baby (70498)
Bad         | low       | gathering (7025)       | fetus (2329)

Filler items: the 240 most frequent triples from McRae et al. (1998)

SLIDE 21

Procedure

- Rewrite each verb in its past-participle form.
- Normalize each role-filler to singular with appropriate determiner.
- Choose either the +human or the –human template:
  - +human: __ is someone who is ___
  - –human: __ is something that is ___
- One survey
  - 6 POLYSEMOUS, 4 MONOSEMOUS, 5 fillers
  - Workers do not see a verb in more than one condition
  - Compensation: $0.15
  - 159 workers participated, 10 ratings per item.
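The template step can be illustrated with a small helper. This is a sketch under assumptions: the function name and signature are made up for illustration, the role-filler is expected to arrive already normalized (singular, with its determiner if it takes one), and the exact surface form used in the survey is not shown on the slides beyond the croquet example.

```python
def build_item(filler_np: str, participle: str, human: bool) -> str:
    """Fill the +human or -human agreement template for one (filler, verb) pair.

    filler_np  -- normalized role-filler, e.g. "a horse" or "croquet"
    participle -- verb in past-participle form, e.g. "whipped"
    human      -- True selects the +human template, False the -human one
    """
    frame = "is someone who is" if human else "is something that is"
    return f"{filler_np} {frame} {participle}."


if __name__ == "__main__":
    print(build_item("croquet", "played", human=False))
    print(build_item("a criminal", "punished", human=True))
```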

SLIDE 22

ANOVA results: polysemy-fit interaction

SLIDE 23

Follow-up ANOVAs

Ø Good: polysemy (), frequency () Ø Bad: polysemy (), frequency ( ) Ø POLYSEMOUS: fit (), frequency ( . ) Ø MONOSEMOUS: fit (), frequency (*)

SLIDE 24

Comparing senses

[Figure panels: More frequent sense; Less frequent sense]

SLIDE 25

Greenberg, Sayeed, and Demberg (2015)

[Figure: verb eat, “with”-prepositional object. Plotted role-fillers: spoon, knife, hand, fork, friend, family, gusto. Marked points: cluster 1 centroid, cluster 2 centroid, cluster 3 centroid, overall centroid.]

SLIDE 26

Overall modelling results

Method    | Spearman’s rho (TypeDM), range = [-1,1]
Centroid  | 0.53
OneBest   | 0.54
kClusters | 0.55

Correlation between our experimental human judgements and automatic scores using LMIs from TypeDM, by prototype generation method.
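For readers unfamiliar with the evaluation measure, Spearman's rho is the Pearson correlation computed on ranks. The sketch below is a plain-Python illustration with average ranks for ties; the rating and score lists are hypothetical toy values, not data from this study, and the function names are chosen for illustration only.

```python
import math
from typing import List, Sequence


def average_ranks(values: Sequence[float]) -> List[float]:
    """Rank values from 1..n, giving tied values the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def pearson(x: Sequence[float], y: Sequence[float]) -> float:
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / math.sqrt(var_x * var_y)


def spearman_rho(x: Sequence[float], y: Sequence[float]) -> float:
    """Spearman's rank correlation: Pearson correlation of the rank vectors."""
    return pearson(average_ranks(x), average_ranks(y))


# Hypothetical toy values: mean human ratings (1-7) and model scores.
HUMAN = [6.5, 6.1, 2.0, 1.5]
MODEL = [0.8, 0.9, 0.3, 0.1]

if __name__ == "__main__":
    print(round(spearman_rho(HUMAN, MODEL), 2))
```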

SLIDE 27

Modelling results by verb type

Method    | POLYSEMOUS | MONOSEMOUS
Centroid  | 0.41       | 0.66
OneBest   | 0.45       | 0.64
kClusters | 0.43       | 0.67

Correlation between our experimental human judgements and automatic scores using LMIs from TypeDM, by prototype generation method and verb type.
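The three prototype generation methods compared above can be contrasted in a small sketch. The scoring rules here are a reading of the method names and should be treated as assumptions: Centroid scores a candidate by cosine similarity to the mean of all typical fillers, OneBest by its best cosine to any single filler, and kClusters by its best cosine to any cluster centroid. The 2-D vectors, the cluster membership and the candidate word are hypothetical, and the clustering step itself is not shown; clusters are supplied by hand.

```python
import math
from typing import Dict, List, Sequence, Tuple

Vector = Tuple[float, ...]

# Hypothetical 2-D vectors for some typical "with"-objects of eat.
VECS: Dict[str, Vector] = {
    "spoon": (1.0, 0.1),
    "fork": (0.9, 0.2),
    "knife": (0.8, 0.0),
    "friend": (0.1, 1.0),
    "family": (0.0, 0.9),
}
# Hand-assigned clusters (assumption): utensils versus company.
CLUSTERS: List[List[str]] = [["spoon", "fork", "knife"], ["friend", "family"]]


def cosine(u: Sequence[float], v: Sequence[float]) -> float:
    """Cosine similarity of two non-zero vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    return dot / (math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v)))


def centroid(vectors: Sequence[Sequence[float]]) -> Vector:
    """Component-wise mean of a non-empty list of vectors."""
    n = len(vectors)
    return tuple(sum(component) / n for component in zip(*vectors))


def centroid_score(candidate: Vector, fillers: Sequence[Vector]) -> float:
    """Cosine to the single overall centroid."""
    return cosine(candidate, centroid(fillers))


def one_best_score(candidate: Vector, fillers: Sequence[Vector]) -> float:
    """Best cosine to any individual filler."""
    return max(cosine(candidate, f) for f in fillers)


def k_clusters_score(candidate: Vector, clusters: Sequence[Sequence[Vector]]) -> float:
    """Best cosine to any cluster centroid."""
    return max(cosine(candidate, centroid(c)) for c in clusters)


# Hypothetical utensil-like candidate filler.
CANDIDATE: Vector = (1.0, 0.0)
ALL_FILLERS = list(VECS.values())
CLUSTER_VECS = [[VECS[w] for w in cluster] for cluster in CLUSTERS]

if __name__ == "__main__":
    print("Centroid ", round(centroid_score(CANDIDATE, ALL_FILLERS), 3))
    print("OneBest  ", round(one_best_score(CANDIDATE, ALL_FILLERS), 3))
    print("kClusters", round(k_clusters_score(CANDIDATE, CLUSTER_VECS), 3))
```

In this toy setup the overall centroid is pulled between the two groups, so a utensil-like candidate scores lower against it than against the utensil cluster centroid. That is the intuition behind clustering for verbs or roles whose typical fillers fall into distinct groups.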

SLIDE 28

The MONOSEMOUS verb “obey”

1. injunction
2. will
3. wish
4. limit
5. equation
6. master
7. law, rule, commandment, principle, regulation, teaching, convention

8. voice, word
9. order, command, instruction, call, summons

SLIDE 29

The POLYSEMOUS verb “observe”

1. day
2. silence
3. difference, change
4. object, star, bird
5. effect, phenomenon, pattern, behaviour, practice, behavior, reaction, movement, trend

6. rule, custom, law, condition

SLIDE 30

Conclusions and future work

Ø Our dataset is available at: http://rollen.mmci.uni-saarland.de/ Ø It is the first thematic fit dataset to vary polysemy of verbs and frequency of role-fillers systematically. Ø We found that polysemy makes good role-fillers not as good and bad role-fillers not as bad. Ø The good role-fillers of a more frequent sense get higher ratings. Ø We verified the trends in Greenberg, Sayeed, and Demberg (2015). Ø Clustering prototypes navigates a trade-off between addressing polysemy and smoothing out noise. Ø The next step is a model that successfully integrates sense frequencies.

SLIDE 31

Thank you!

Data from this project available at http://rollen.mmci.uni-saarland.de/

SLIDE 32

References

Bird, S., Klein, E., and Loper, E. (2009). Natural Language Processing with Python. O'Reilly Media.

Davies, M. (2008). The Corpus of Contemporary American English: 450 million words, 1990-present. Available online at http://corpus.byu.edu/coca/.

Fellbaum, C. (1998). WordNet: An Electronic Lexical Database. Wiley Online Library.

Greenberg, C., Sayeed, A., and Demberg, V. (2015). Improving unsupervised vector-space thematic fit evaluation via role-filler prototype clustering. In Proceedings of the 2015 Conference of the North American Chapter of the Association for Computational Linguistics - Human Language Technologies, Denver, USA.

McRae, K., Spivey-Knowlton, M. J., and Tanenhaus, M. K. (1998). Modeling the influence of thematic fit (and other constraints) in on-line sentence comprehension. Journal of Memory and Language, 38(3):283-312.

Michel, J., Shen, Y.K., Aiden, A.P., Veres, A., Gray, M.K., Brockman, W., The Google Books Team, Pickett, J.P., Hoiberg, D., Clancy, D., Norvig, P., Orwant, J., Pinker, S., Nowak, M.A., and Aiden, E.L. (2010). Quantitative Analysis of Culture Using Millions of Digitized Books. Science.

Padó, U. (2007). The integration of syntax and semantic plausibility in a wide-coverage model of human sentence processing. PhD thesis, Saarland University.
