SLIDE 1

MORE THAN WORDS

A DISCRIMINATIVE LEARNING MODEL WITH LEXICAL BUNDLES

March 8th, 2017

Saskia E. Lensink, R. Harald Baayen
s.e.lensink@hum.leidenuniv.nl

SLIDE 2

A typology of multi-word units

Wray (2012)

SLIDE 4

Multi-word units

■ Indicator of nativeness
■ Thought to be represented as a whole
■ How can we experimentally test for the cognitive reality of these multi-word units?

SLIDE 5

Multi-word frequencies

Previous studies have found an effect of the frequencies of regular multi-word units, which suggests storage of wholes

SLIDE 6

Previous studies

■ self-paced reading (Tremblay, Derwing, Libben, & Westbury, 2011)
■ phrasal decision tasks (Arnon & Snider, 2010; Ellis & Simpson-Vlach, 2009)
■ priming of the last word of the n-gram (Ellis & Simpson-Vlach, 2009)
■ word reading tasks (Arnon & Priva, 2013; Ellis & Simpson-Vlach, 2009; Han, 2015; Tremblay & Tucker, 2011)
■ picture naming (Janssen & Barber, 2012)
■ sentence recall (Tremblay et al., 2011)
■ immediate free recall (Tremblay & Baayen, 2010)
■ eye-tracking (Siyanova-Chanturia, Conklin, & Van Heuven, 2011)
■ ERPs (Tremblay & Baayen, 2010)
■ L1 language acquisition (Bannard & Matthews, 2008)
■ L2 speakers (Conklin & Schmitt, 2012; Han, 2015; Jiang & Nekrasova, 2007; Siyanova-Chanturia et al., 2011)

SLIDE 7

Frequency is an impoverished measure

■ Collapses counts of homophones
■ Collapses counts of different senses
■ Language always occurs in context; prediction also plays a large role in processing
■ Salience and recency also play a role

SLIDE 8

Mind the neighbors!

■ When studying words, we pay attention to
– Frequency effects
– Length
– Neighborhood density effects
■ When studying multi-word units, we pay attention to
– Frequency effects
– Length
– But not to neighborhood density effects!

SLIDE 9

Motivation for our study

■ The framework of discriminative learning has given us some new insights into language
■ A computational model implementing discriminative learning, NDL, provides us with a measure reflecting neighborhood density effects
■ Adding measures from discriminative learning to our models of multi-word unit processing might yield new insights into that processing
■ We conducted both an eye-tracking study and a production study, to cover comprehension and production

SLIDE 10

NDL

Baayen et al., 2011

■ Naïve Discriminative Learning
■ Implements Rescorla-Wagner equations that specify how experience alters the strength of association of a cue to a given outcome
■ Distributional properties of corpus data used, using basic principles of error-driven learning
■ Weight from cues to outcomes adjusted depending on correct/incorrect prediction of an outcome given a certain cue

This approach successfully predicted word frequency effects, morphological family size effects, inflectional entropy effects, and phrasal frequency effects
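
As a rough illustration of the error-driven update described above, here is a minimal sketch of the Rescorla-Wagner rule. For every cue present in a learning event, the weight to an outcome changes by the learning rate times (target − summed weight of the present cues), where the target is λ if the outcome occurs and 0 if it does not. The function name, the single `learning_rate` parameter (standing in for the product of the cue and outcome salience parameters), and the toy events are assumptions made for this sketch; it is not the authors' implementation.

```python
from collections import defaultdict


def rescorla_wagner(events, learning_rate=0.01, lam=1.0):
    """Learn cue-to-outcome weights from a sequence of learning events.

    events: list of (cues, outcomes) pairs, each a collection of labels.
    Returns a dict mapping (cue, outcome) to an association weight.
    """
    events = [(set(cues), set(outcomes)) for cues, outcomes in events]
    all_outcomes = set()
    for _, outcomes in events:
        all_outcomes.update(outcomes)

    weights = defaultdict(float)
    for cues, outcomes in events:
        for outcome in all_outcomes:
            # Prediction for this outcome: summed weights of the cues present.
            predicted = sum(weights[(cue, outcome)] for cue in cues)
            # Target is lam when the outcome occurs, 0 when it does not.
            target = lam if outcome in outcomes else 0.0
            delta = learning_rate * (target - predicted)
            # Every present cue shares the same prediction-error update.
            for cue in cues:
                weights[(cue, outcome)] += delta
    return weights


# Hypothetical toy events: word cues predicting trigram outcomes.
toy_events = [
    ({"w1", "w2", "w3"}, {"ABC"}),
    ({"w1", "w2", "w4"}, {"ABD"}),
    ({"w1", "w2", "w3"}, {"ABC"}),
]
toy_weights = rescorla_wagner(toy_events, learning_rate=0.1)
```

In this toy run, the cue `w3` only ever occurs with `ABC`, so its weight to `ABC` only grows, while the shared cues `w1` and `w2` are pushed down for `ABC` whenever `ABD` occurs instead.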

SLIDE 11

NDL

Baayen et al., 2011

■ Outcomes are thought of as pointers to locations in a multi-dimensional semantic space
■ These locations are constantly updated by the experiences a language user has

SLIDE 12

NDL with lexical bundles

SLIDE 13

Weight word X

Bottom-up information

SLIDE 14

Total activation trigram (act)

Bottom-up information

SLIDE 15

Prior activation trigram

Top-down information

SLIDE 16

Activation diversity

Competing trigrams – neighborhood density
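
The four measures on the preceding slides are named without formulas. One possible reading is sketched below. It assumes definitions that are common in the NDL literature and that may differ in detail from what the authors computed: the weight of a word is its single cue-to-trigram weight, the total activation is the sum of the weights from all cues present, the prior is the summed absolute weight flowing into a trigram from every cue in the network, and activation diversity is the summed absolute activation over all trigrams. The weight table and every name in the code are invented for illustration.

```python
# Hypothetical weight table: (word cue, trigram outcome) -> weight.
toy_weights = {
    ("w1", "ABC"): 0.2,
    ("w2", "ABC"): 0.3,
    ("w3", "ABC"): 0.1,
    ("w1", "ABD"): 0.4,
    ("w4", "ABD"): -0.1,
}


def word_weight(weights, word, trigram):
    """Bottom-up support of one word for a trigram."""
    return weights.get((word, trigram), 0.0)


def total_activation(weights, cues, trigram):
    """Bottom-up support of all cues in the input for a trigram."""
    return sum(weights.get((cue, trigram), 0.0) for cue in cues)


def prior_activation(weights, trigram):
    """Input-independent (top-down) availability of a trigram (assumed L1 norm)."""
    return sum(abs(w) for (_, outcome), w in weights.items() if outcome == trigram)


def activation_diversity(weights, cues, trigrams):
    """How strongly the input activates all trigrams, competitors included."""
    return sum(abs(total_activation(weights, cues, t)) for t in trigrams)


cues = ["w1", "w2", "w3"]
act_abc = total_activation(toy_weights, cues, "ABC")
prior_abd = prior_activation(toy_weights, "ABD")
act_div = activation_diversity(toy_weights, cues, ["ABC", "ABD"])
```

Under these assumed definitions, a high activation diversity means the same input also lends support to many competing trigrams, which is why it can serve as a neighborhood density measure.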

SLIDE 17

Eye-tracking experiment

■ [Placeholder: picture of an eye tracker, an eye, or similar]

Eye tracking

SLIDE 18

Stimuli

■ Most common n-grams (trigrams) from the corpus
■ OpenSoNaR corpus
■ Use frequencies extracted from a corpus of Dutch subtitles (N = 109,807,716)
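
Selecting the most common trigrams comes down to counting every three-word window in the corpus. The sketch below shows one simple way to do that; the whitespace tokenization, the function name, and the three toy sentences are assumptions for illustration and say nothing about how the actual stimuli were extracted.

```python
from collections import Counter


def trigram_counts(sentences):
    """Count word trigrams within each sentence (no windows across sentences)."""
    counts = Counter()
    for sentence in sentences:
        tokens = sentence.lower().split()
        # zip over three staggered views yields every three-word window.
        counts.update(zip(tokens, tokens[1:], tokens[2:]))
    return counts


# Invented toy corpus, not taken from OpenSoNaR or the subtitle corpus.
toy_corpus = [
    "ik weet het niet",
    "ik weet het wel",
    "dat weet ik niet",
]
counts = trigram_counts(toy_corpus)
top_trigram, top_count = counts.most_common(1)[0]
```

Here `top_trigram` is `("ik", "weet", "het")`, which occurs twice in the toy corpus.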

SLIDE 19

Procedure

■ Silent reading
■ Comprehension questions to ascertain attentive reading
■ 30 participants (10 male)
■ Analyzed using generalized additive mixed-effects models (GAMMs)

SLIDE 20

Modeling data

■ Test whether, and to what extent, NDL measures give us insights over and above more traditional frequency measures
■ Some frequency and NDL measures show a high amount of collinearity
– e.g. ‘freqABC’ and ‘prior’
■ Models with just frequencies performed worse than models with both frequencies and NDL measures
■ Neighborhood density effects are best reflected by the Activation Diversity measure, which was a significant predictor in several models
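
A quick way to spot the kind of collinearity mentioned above is to correlate two predictors before entering them into the same model. The sketch below computes a Pearson correlation by hand. The values assigned to `freq_abc` and `prior` are invented toy numbers, not data from the study, and the slides do not say which diagnostic the authors used.

```python
import math


def pearson(xs, ys):
    """Pearson correlation coefficient of two equally long sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)


# Invented values for five hypothetical trigrams.
freq_abc = [2.1, 3.4, 4.0, 5.2, 6.3]  # e.g. log trigram frequencies
prior = [0.9, 1.6, 1.8, 2.7, 3.1]     # e.g. NDL prior activations
r = pearson(freq_abc, prior)
```

A correlation close to 1 or −1 signals that the two predictors carry largely the same information, so their separate effects are hard to tell apart in one model.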

SLIDE 21

First fixation durations

[Figure: first fixation durations. Plot labels: ActDivTrigram, FreqABC, FreqC, firstFixX]

SLIDE 22

Second fixation durations

[Figure: second fixation durations. Plot labels: secondFixX, length, prior, Weight word 3]

SLIDE 23

Number of fixations

[Figure: number of fixations. Plot labels: secondFixX, firstFixX]

SLIDE 24

Discussion eye-tracking data

■ Effects of the trigram frequencies and the third word appear already in the first fixation
■ Processes of top-down information (frequency effects), bottom-up information (activations) and uncertainty reduction (activation diversity/neighborhood effects)
■ Knowledge verification (frequencies): a reader spends more time in early measures when frequencies are higher and enough information is available; if not, a new fixation is planned as soon as possible
■ Bottom-up information (w3): when further into the trigram at your second fixation, it pays to spend more time to resolve things locally if the third word provides a lot of support for the trigram. If not, participants are faster to refixate
■ Uncertainty reduction (neighborhood density): if there are many competing trigrams, looking times in first fixations are shorter and the number of fixations is higher

SLIDE 25

General discussion

■ Multi-word units are a relevant unit of storage (also in Dutch)
■ Both single words and the full trigram play a role
■ Adding measures from a discriminative model provides us with new insights into the processing of MWUs
■ Considering neighborhood density effects provides us with more insights into the workings of MWU processing
■ In processing of multi-word units, opposing forces of top-down information, bottom-up information and uncertainty reduction are at work

SLIDE 26

Questions?

SLIDE 27

Extra slides – production

SLIDE 28

Production experiments

SLIDE 29

Procedure

■ Same stimuli as used in the eye-tracking study
■ Word reading task
■ 30 participants (8 male)
■ Onsets and durations measured using Praat
■ Analyzed using generalized additive mixed-effects models (GAMMs)

SLIDE 30

Production onsets

SLIDE 31

Production durations

SLIDE 32

A trade-off

[Figure: naming latencies and durations]

SLIDE 33

Discussion production data

■ Processes of top-down information (frequency effects), bottom-up information (activations) and uncertainty reduction (activation diversity/neighborhood effects)
■ There is a trade-off between starting early and being able to pronounce the trigram fast
■ Top-down information slows you down at first, but makes total durations shorter (longer to plan, but easier motor program to execute)
■ Bottom-up information gives you a quick start but slows you down later (shorter to plan, but harder motor program to execute)
■ Neighborhood effects apparent in production durations: longer durations when the number of neighbors is different from the average (less motor practice)
