Modelling word perception and comprehension across modalities

SLIDE 1

15/09/2017

PhD student: Danny Merkx
Supervisors: Stefan Frank (CLS), Mirjam Ernestus (CLS), Raquel Fernandez (ILLC), Louis ten Bosch (CLS)

Modelling word perception and comprehension across modalities

Psychology in Big Question 1

Research objectives

  • Develop a (cognitively plausible) vector representation of word form
  • Apply these in a computational model that simulates spoken and written word perception
  • Investigate the interplay between form and meaning, and its role in learning and comprehension

[kat]

SLIDE 2

Written: Interactive Activation (McClelland & Rumelhart, 1981)

Spoken: TRACE (McClelland & Elman, 1986)

Challenge #1

Different modalities in a single model

  • Current DSMs are amodal
  • Word-perception models lack semantics and deal with only one modality
  • Reading is not independent from speech perception
  • How to capture the unique perceptual constraints posed by different modalities in a vector model?

Challenge #2

Complex relations between form, identity, and meaning

[Figure: example lexical network linking Meanings (17 dec 1903, 20 jul 1969, 14 sep 2017), Identities (data, dates1, dates2, dates3), and Forms ([deɪtə], [datə], [deɪts], [da:ta:], [de:ts])]

SLIDE 3

  • Simply running a DSM on a bilingual corpus would result in clustering by language → vectors do not reflect semantics
  • (Most?) current work in bilingual DSMs:
    − Get two monolingual vector spaces
    − Combine such that translation equivalents receive similar vectors
  • No way to tell the languages apart
  • Not evaluated against human processing data

Challenge #2

Models of the bilingual lexicon

Artetxe, Labaka, & Agirre (ACL, 2017)
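The "get two monolingual spaces, then combine so that translation equivalents receive similar vectors" recipe can be sketched as an orthogonal Procrustes mapping learned from a seed dictionary of translation pairs. This is a toy illustration with random vectors, not the Artetxe, Labaka, & Agirre method in full (which bootstraps the dictionary via self-learning):

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy monolingual spaces: Y is X under an unknown rotation plus noise,
# standing in for two separately trained embedding sets whose rows are
# translation pairs from a seed dictionary.
X = rng.normal(size=(50, 10))                       # source-language vectors
R_true, _ = np.linalg.qr(rng.normal(size=(10, 10))) # hidden "true" rotation
Y = X @ R_true + 0.01 * rng.normal(size=(50, 10))   # target-language vectors

# Orthogonal Procrustes: the orthogonal W minimising ||XW - Y||_F
# is U @ Vt, where U, S, Vt = svd(X^T Y).
U, _, Vt = np.linalg.svd(X.T @ Y)
W = U @ Vt

# After mapping, translation equivalents lie close together.
before = np.linalg.norm(X - Y)
after = np.linalg.norm(X @ W - Y)
```

Because the mapping is orthogonal, each monolingual space keeps its internal geometry; only the relative orientation of the two spaces changes.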

  • The bilingual mental lexicon: languages are integrated but can still be told apart
  • Psycholinguistic models of the bilingual mental lexicon:
    ‒ Account for response/naming times, cognate/homograph effects, interlingual priming, etc.
    ‒ But: small vocabularies, not trainable, no realistic semantics

Costa et al. (Cognitive Science, 2017)


SLIDE 4

Challenge #3

Psycholinguistic evaluation

  • How to measure the psychological accuracy of word vectors?
  • How to evaluate representations on human data from sentence/discourse comprehension?
  • Evaluation of statistical language models based on word surprisal:
    ‒ Correlate to measure of processing difficulty in naturalistic materials

[Figure: fit to reading times from an eye-tracking study (Frank & Thompson, Proc. CogSci, 2012) and fit to N400 size for content words (Frank, Otten, Galli, & Vigliocco, Brain & Language, 2015), each plotted against average surprisal (±χ²)]
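The surprisal-based evaluation logic can be sketched in a few lines: estimate a language model, compute per-word surprisal on the materials, then correlate it with a processing measure. A minimal illustration with a made-up toy corpus and an add-one-smoothed bigram model:

```python
import math
from collections import Counter

# Toy "naturalistic materials" (made up for illustration).
corpus = "the cat sat on the mat and the cat slept".split()

# Add-one-smoothed bigram language model.
unigrams = Counter(corpus)
bigrams = Counter(zip(corpus, corpus[1:]))
vocab = len(unigrams)

def surprisal(prev, word):
    """Surprisal in bits: -log2 P(word | prev)."""
    p = (bigrams[(prev, word)] + 1) / (unigrams[prev] + vocab)
    return -math.log2(p)

# Per-word surprisal; in an actual evaluation these values would be
# correlated with per-word reading times or N400 amplitudes.
values = [surprisal(p, w) for p, w in zip(corpus, corpus[1:])]
```

A model whose surprisal correlates more strongly with the processing measure is, on this view, the more cognitively accurate one.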

SLIDE 5

Challenge #3

Psycholinguistic evaluation

  • How to measure the psychological accuracy of word vectors?
  • How to evaluate representations on human data from sentence/discourse comprehension?
  • Evaluating statistical language models based on word surprisal:
    ‒ Correlate to measure of processing difficulty in naturalistic materials
    ‒ Independent measure of linguistic accuracy is helpful for model comparison
    ‒ Model variants are cognitively interpretable

Not (so much) for DSMs

  • Mandera, Keuleers, & Brysbaert (2017):

– Compared state-of-the-art DSMs (Skipgram, CBOW) and traditional count-based DSM, on wide range of parameter values

Psycholinguistic evaluation of DSMs

Recent work

SLIDE 6

  • Mandera, Keuleers, & Brysbaert (2017):
    – Compared state-of-the-art DSMs (Skipgram, CBOW) and traditional count-based DSM, on wide range of parameter values
    – Implicit measures (response times in semantic priming): fairly small difference between model types
    – Explicit norms (association, semantic relatedness): CBOW is best
  • Rotaru, Vigliocco, & Frank (submitted):
    – Markov chain over semantic distance matrix (from CBOW, GloVe, LSA) simulates dynamics in semantic network

Psycholinguistic evaluation of DSMs

Recent work

SLIDE 7

  • Mandera, Keuleers, & Brysbaert (2017):
    – Compared state-of-the-art DSMs (Skipgram, CBOW) and traditional count-based DSM, on wide range of parameter values
    – Implicit measures (response times in semantic priming): fairly small difference between model types
    – Explicit norms (association, semantic relatedness): CBOW is best
  • Rotaru, Vigliocco, & Frank (submitted):
    – Markov chain over semantic distance matrix (from CBOW, GloVe, LSA) simulates dynamics in semantic network
    – This improves fit to human data (association/relatedness norms, lexical/semantic decision times and accuracies)
    – CBOW outperformed GloVe and LSA in almost all tests

Psycholinguistic evaluation of DSMs

Recent work
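The Markov-chain idea can be sketched as activation spreading over row-normalised pairwise similarities. The matrix below is a made-up stand-in for similarities (inverse distances) derived from CBOW/GloVe/LSA vectors, not data from the Rotaru et al. study:

```python
import numpy as np

# Toy pairwise similarity matrix over a 4-word vocabulary:
# words 0-1 form one semantic cluster, words 2-3 another.
sim = np.array([
    [1.0, 0.8, 0.1, 0.1],
    [0.8, 1.0, 0.2, 0.1],
    [0.1, 0.2, 1.0, 0.7],
    [0.1, 0.1, 0.7, 1.0],
])

# Row-normalise similarities into Markov transition probabilities.
P = sim / sim.sum(axis=1, keepdims=True)

# Start all activation on word 0 and let it spread for a few steps,
# simulating dynamics in a semantic network.
state = np.array([1.0, 0.0, 0.0, 0.0])
for _ in range(3):
    state = state @ P
```

After a few steps, activation concentrates on the neighbourhood of the starting word, which is the dynamic signal that can then be compared against human association and decision data.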

SLIDE 8

  • Frank & Willems (2017):
    – Naturalistic materials: UCL corpus sentences (written) and excerpts from Dutch audiobooks (spoken)
    – Cosine distance (using Skipgram) between each content word and the sum of previous content words

Psycholinguistic evaluation of DSMs

Recent work

Unique effects of surprisal and semantic distance

Frank & Willems (Language, Cognition and Neuroscience, in press)
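The semantic-distance measure can be sketched directly: one minus the cosine between a content word's vector and the sum of the vectors of the preceding content words. The 3-d vectors below are made up for illustration and stand in for Skipgram embeddings:

```python
import numpy as np

def semantic_distance(word_vec, context_vecs):
    """1 - cosine(word vector, sum of previous content-word vectors)."""
    context = np.sum(context_vecs, axis=0)
    cos = (word_vec @ context) / (np.linalg.norm(word_vec) * np.linalg.norm(context))
    return 1.0 - cos

# Made-up 3-d embeddings standing in for Skipgram vectors.
previous = [np.array([1.0, 0.0, 0.0]), np.array([0.8, 0.2, 0.0])]
related = np.array([0.9, 0.1, 0.0])    # semantically close to the context
unrelated = np.array([0.0, 0.0, 1.0])  # orthogonal to the context
```

A word that continues the established topic gets a small distance; a topic shift gets a large one, and it is this per-word value that is regressed against N400, BOLD, and reading-time data.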

SLIDE 9

  • Frank & Willems (2017):
    – Naturalistic materials: UCL corpus sentences (written) and excerpts from Dutch audiobooks (spoken)
    – Cosine distance (using Skipgram) between each content word and the sum of previous content words
    – Explained variance in N400 and BOLD responses offers possibilities for comparing DSMs
  • But reading times appear to be insensitive to semantic distance

Psycholinguistic evaluation of DSMs

Recent work

[Figure: coefficients of current and previous semantic distance for four reading-time measures (FF, FP, RB, GP), in the UCL and Dundee corpora, with surprisal baselines from no surprisal up to 5-gram]

Reading times from two eye-tracking corpora: (no) effect of semantic distance

Frank (Proc. CogSci, 2017)

SLIDE 10

Potential pitfalls for BQ1 collaboration

  • Different opinions about the meaning and importance of cognitive plausibility
  • Different opinions about model evaluation: task performance versus human performance
  • BQ1's goal to link neurobiology and cognition does not mean that psychology must be reduced to neuroscience
  • Behavioural data are relevant too: not all questions are about the brain, and model comparison may be more difficult on high-dimensional (neural) data