Everything will be online Phonatics blackboard site; if you need - - PowerPoint PPT Presentation

SLIDE 1

Everything will be online

  • Phonatics blackboard site; if you need access email me.
  • These slides
  • Two example datasets
  • Papers are available as well:

– Continuous measures: http://dx.doi.org/10.1016/j.jml.2007.12.005
– Categorical measures: http://dx.doi.org/10.1016/j.jml.2007.11.007
– Comprehensive review in Baayen’s book: http://www.ualberta.ca/~baayen/publications/baayenCUPstats.pdf

SLIDE 2

Parametric Statistical Analyses

  • Dependent variable: What you’re measuring

– RT, accuracy, VOT
– Can be categorical (accuracy) or continuous (RT)

  • Factors: Things that influence this dependent variable

– Frequency of word, person who said it, lexical identity of the word
– Can be categorical (place of articulation) or continuous (frequency)

  • (Statistical) Model: Uses factors to make predictions about the dependent variable.

– Focus today: linear models
– Model is a linear sum of predictors
– Note that predictors can be single factors (frequency) as well as interactions (frequency * grammatical category)
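
To make "linear sum of predictors" concrete, here is a minimal base R sketch. The data frame `toy`, its columns `freq` and `cat`, and the weights in `beta` are all made up for illustration; none of them come from the example datasets.

```r
# Hypothetical toy data: one continuous and one categorical factor.
toy <- data.frame(
  freq = c(1.2, 3.4, 2.2, 5.0),
  cat  = factor(c("noun", "verb", "noun", "verb"))
)

# freq * cat expands to: intercept + freq + cat + freq:cat
X <- model.matrix(~ freq * cat, data = toy)
print(colnames(X))

# A linear model's prediction is the weighted sum of these columns.
beta <- c(0.5, 0.1, -0.2, 0.05)   # arbitrary illustrative weights
pred <- as.vector(X %*% beta)
print(pred)
```

Each column of `X` is one predictor; the interaction is simply one more column (the product of the other two).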

SLIDE 3

Why ANOVA?

  • Pairwise comparisons introduce errors. Lots of factors = lots of conditions = lots of comparisons = lots of opportunities for false positives.

– ANOVA avoids this by testing a single measure (variance) across groups.

  • Factors are not all fixed.

– Sometimes we examine the influence of a factor that has a finite number of possible levels, all of which we sample (e.g., there are 3 places of articulation for English oral stops; our experiment includes b, d, g).
– In other cases, we have a random sample from among an infinite set of possible levels (e.g., we run 10 NU undergraduates; the space of possible undergraduate humans is infinite).
– ANOVA allows us to treat one factor as random.
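
The "lots of comparisons = lots of false positives" point can be illustrated with a short calculation. This sketch assumes the tests are independent and each is run at alpha = .05, which is a simplification.

```r
# Chance of at least one false positive across k independent tests,
# each run at alpha = .05 (independence is a simplifying assumption).
alpha <- 0.05
k <- c(1, 3, 6, 10)   # e.g., 4 conditions give choose(4, 2) = 6 pairs
familywise <- 1 - (1 - alpha)^k
print(round(familywise, 3))
```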

SLIDE 4

Issues with ANOVA

  • Dependent variables are not always continuous. Converting or transforming an inherently categorical measure like accuracy into a semi-continuous measure (e.g., % correct) introduces errors (even with the RAU, or rationalized arcsine unit, transform).

  • Factors are not always categorical. ANOVA functions best with categorical factors; there is some limited success with techniques such as ANCOVA.

  • Sometimes you have multiple random factors. Both subjects and items can be random; there is no way to incorporate both into a single analysis.

SLIDE 5

Mixed Effects Regression

  • Dependent variables are not always continuous.

– Output of statistical model can either be directly related to continuous dependent variable, or linked by a logistic function [= probability of one categorical output].
– Output can be RT or Pr(correct).

  • Factors are not always categorical.

– Mixed effects allows for both continuous as well as categorical predictors.

  • Sometimes you have multiple random factors.

– Random factors are just additional terms in the regression; can incorporate more than one.
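
As a small illustration of the logistic link mentioned above, the following base R sketch maps values of the linear predictor (log odds) onto probabilities. The values in `eta` are arbitrary.

```r
# Linear predictor values (log odds); any real number is allowed.
eta <- c(-4, -1, 0, 1, 4)

# The logistic function maps log odds to a probability in (0, 1).
p <- plogis(eta)            # same as 1 / (1 + exp(-eta))
print(round(p, 3))

# qlogis() is the inverse (the logit link).
back <- qlogis(p)
```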

SLIDE 6

Prerequisites

  • A current version of R
  • languageR libraries

– In R, go to the Packages menu
– Find languageR; install it, and it will also install all the other libraries it needs.

  • Some data (factors + observations along a dependent measure)
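
If you prefer the console to the Packages menu, something like the sketch below checks what is missing. The helper `missing_packages` is a hypothetical name introduced here, not part of R or languageR.

```r
# Report which of the needed packages are not yet installed.
# (missing_packages is a hypothetical helper, defined here.)
missing_packages <- function(pkgs) {
  installed <- vapply(pkgs, requireNamespace, logical(1), quietly = TRUE)
  pkgs[!installed]
}

needed <- c("languageR", "lme4")
to_install <- missing_packages(needed)
if (length(to_install) > 0) {
  message("Run: install.packages(c(",
          paste0('"', to_install, '"', collapse = ", "), "))")
}
```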

SLIDE 7

Formatting Your Data

  • Each line in the file should be a SINGLE observation.

– Only ONE column should hold the measured data (the dependent variable).
– This should be an observation, not an average or total.
– Save as a tab-delimited txt file.

  • Example: VOT data.

Subject  Word  POA     Voicing  Freq  VOT
S001     bat   labial  voiced   10.5  13.2

Note: Categorical variables are text; don’t use 001 for Subject 1.

  • Example: Accuracy data.

Subject  Lang  SNR  Trial  Accuracy
S001     Eng   3    1      Correct

Note: Enter each trial separately, coded as correct vs. incorrect!
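
A minimal sketch of how a file in this layout reads into R. The text is supplied inline rather than from a file, and the second data row is invented for illustration.

```r
# Two rows in the VOT layout (tab-separated); row 2 is made up.
txt <- paste(
  "Subject\tWord\tPOA\tVoicing\tFreq\tVOT",
  "S001\tbat\tlabial\tvoiced\t10.5\t13.2",
  "S002\tpat\tlabial\tvoiceless\t8.1\t61.0",
  sep = "\n"
)

# stringsAsFactors = TRUE makes text columns factors; since R 4.0
# the default is FALSE, so they would otherwise stay character.
dat <- read.delim(text = txt, stringsAsFactors = TRUE)
str(dat)
```

Because Subject is text ("S001", not 001), R treats it as a category and not as a number.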

SLIDE 8

Data set 1: VOT

  • Correct productions from a tongue twister experiment.
  • Effect of voicing, place; no interaction
SLIDE 9

Analysis in R

  • Read in data

> dat = read.delim("vot.txt")

  • Build your model

> library(lme4)
> dat.lmer = lmer(VOT~Voicing*POA+(1|Subject),data=dat)

  • lmer = “Linear mixed effects regression”
  • Model is specified by referring to column headers (= factor names)

– VOT ~ : Predict VOT
– Voicing * POA: Use a full factorial model combining Voicing and place
– +(1|Subject): Allow each subject to have their own ‘baseline’ VOT (= intercept)
– data = dat: Variable where data is stored
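
If you do not have vot.txt to hand, the sketch below fits the same formula to simulated data. Every number in the simulation is invented, so the estimates will not match the slides. It assumes the lme4 package is installed and skips the fit otherwise.

```r
set.seed(1)
# Simulated stand-in for the VOT data; all numbers are made up.
sim <- expand.grid(
  Subject = paste0("s", 1:10),
  Voicing = c("voiced", "voiceless"),
  POA     = c("bilabial", "coronal", "velar"),
  rep     = 1:5
)
subj_shift <- rnorm(10, sd = 0.006)            # per-subject baseline
sim$VOT <- 0.02 +
  0.025 * (sim$Voicing == "voiceless") +
  subj_shift[as.integer(sim$Subject)] +
  rnorm(nrow(sim), sd = 0.016)

if (requireNamespace("lme4", quietly = TRUE)) {
  fit <- lme4::lmer(VOT ~ Voicing * POA + (1 | Subject), data = sim)
  print(lme4::fixef(fit))
} else {
  message("lme4 is not installed; skipping the model fit")
}
```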

SLIDE 10

Analysis in R

  • Results

> dat.lmer
Linear mixed model fit by REML
Formula: VOT ~ Voicing * POA + (1 | Subject)
   Data: dat
   AIC   BIC logLik deviance REMLdev
 -4888 -4849   2452    -4969   -4904
Random effects:
 Groups   Name        Variance   Std.Dev.
 Subject  (Intercept) 0.00004044 0.0063593
 Residual             0.00025566 0.0159894
Number of obs: 918, groups: Subject, 10

  • Formula line: reminder of what the model is.
  • AIC, BIC, logLik, deviance: characterization of how well it fits the data.
  • Random effects: contribution of the random effects; in addition, the overall model has some degree of intrinsic error (the Residual line), assumed to be normally distributed.

SLIDE 11

Fixed effects:
                             Estimate Std. Error t value
(Intercept)                  0.019582   0.002753   7.114
Voicingvoiceless             0.025920   0.002384  10.872
POAbilabial                 -0.004416   0.002091  -2.112
POAvelar                     0.009327   0.002442   3.819
Voicingvoiceless:POAbilabial 0.002404   0.002773   0.867
Voicingvoiceless:POAvelar    0.004845   0.003220   1.505

  • Estimate (with error) of direction and degree of factor’s influence on dependent measure.
  • t ≈ Student’s t
  • Categorical variables are ‘dummy’ coded as 1/0. R sorts the levels alphabetically and assigns 0 to the first one. If a factor has more than two levels, it is split into several 0/1 variables, one per non-reference level (here, POA becomes two variables).
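
The dummy coding can be inspected directly in base R. In this sketch the three-level factor uses placeholder level names ("a", "b", "c") so as not to assume how the real data file labels place of articulation.

```r
voicing <- factor(c("voiceless", "voiced", "voiced", "voiceless"))
levels(voicing)        # sorted alphabetically; first is the reference
contrasts(voicing)     # reference level is coded 0

# A three-level factor becomes two 0/1 columns.
# The level names below are illustrative only.
place <- factor(c("a", "b", "c", "a"))
contrasts(place)

# relevel() picks a different reference level if needed.
place2 <- relevel(place, ref = "b")
levels(place2)
```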

SLIDE 12

Side note: Random effects

  • Each level of random effect gets its own intercept

> ranef(dat.lmer)
$Subject
      (Intercept)
s1  -5.989105e-03
s10 -7.125361e-03
s2   3.499354e-03
s3   4.140038e-03
s4   6.205415e-05
s5  -2.179526e-03
s6  -1.000678e-02
s7   6.203679e-03
s8   5.616277e-03
s9   5.779373e-03

  • “random” component: assume there is an independent source of error on these factors

– Other individuals you might test would also have different intercepts; they are randomly distributed around the current sample of participants.

SLIDE 13

Fixed effects:
                 Estimate Std. Error t value
(Intercept)      0.019582   0.002753   7.114
Voicingvoiceless 0.025920   0.002384  10.872

  • Positive coefficient estimate: Relative to voiced (reference level), voiceless consonants have longer VOTs.
  • Estimate (with error) of direction and degree of factor’s influence on dependent measure.
  • t ≈ Student’s t

SLIDE 14

Fixed effects:
             Estimate Std. Error t value
(Intercept)  0.019582   0.002753   7.114
POAbilabial -0.004416   0.002091  -2.112
POAvelar     0.009327   0.002442   3.819

  • What is the reference level for POA?
  • Estimate (with error) of direction and degree of factor’s influence on dependent measure.
  • t ≈ Student’s t

SLIDE 15

Fixed effects:
             Estimate Std. Error t value
(Intercept)  0.019582   0.002753   7.114
POAbilabial -0.004416   0.002091  -2.112
POAvelar     0.009327   0.002442   3.819

  • What is the reference level for POA? Coronal.
  • What does a positive coefficient for velar and a negative coefficient for bilabial mean?
  • Estimate (with error) of direction and degree of factor’s influence on dependent measure.
  • t ≈ Student’s t

SLIDE 16

Fixed effects:
             Estimate Std. Error t value
(Intercept)  0.019582   0.002753   7.114
POAbilabial -0.004416   0.002091  -2.112
POAvelar     0.009327   0.002442   3.819

  • What is the reference level for POA? Coronal.
  • What does a positive coefficient for velar and a negative coefficient for bilabial mean? Bilabials have shorter VOTs than coronals; velars have longer VOTs than coronals.
  • Estimate (with error) of direction and degree of factor’s influence on dependent measure.
  • t ≈ Student’s t

SLIDE 17

Correlation of Fixed Effects:
               (Intr) Vcngvc POAblb POAvlr Vcngvclss:POAb
Voicngvclss    -0.496
POAbilabial    -0.580  0.648
POAvelar       -0.484  0.555  0.640
Vcngvclss:POAb  0.429 -0.857 -0.744 -0.478
Vcngvclss:POAv  0.360 -0.729 -0.482 -0.750  0.630

  • Last bit of output: Table summarizing correlation of predictors. If they are highly correlated, the model might not be accurately characterizing the data.

SLIDE 18

Significance

  • Method 1: Significance of a factor = coefficient of the factor is not equal to 0.
  • We have t-values but not p-values. What’s up with that?
  • Calculating degrees of freedom for these t-values is extremely complicated.
  • Baayen’s approach: Use Markov Chain Monte Carlo (MCMC) sampling to estimate the distribution of the parameters of the statistical model.

– MCMC is just a means of randomly sampling from probability distributions.
– Here, we randomly sample from the probability distribution of model estimates.

SLIDE 19

Significance

  • Command

> pvals.fnc(dat.lmer)

  • Will generate a set of figures giving you the distribution of each parameter.

  • Critical output: Proportion of MCMC samples where estimate > or < 0.

– Voicing: < .0001
– POA bilabial: .0368
– POA velar: .0004
– Voicing * Bilabial: .3874
– Voicing * Velar: .1384
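
The sketch below illustrates the idea behind "proportion of samples where estimate > or < 0". It is not the internals of pvals.fnc: normal random draws stand in for real MCMC samples, centered on the POAbilabial estimate and standard error from the model output. Note also that lme4 releases from 1.0 onward dropped the MCMC sampler that pvals.fnc relies on, so the command above may not run on a current installation.

```r
set.seed(42)
# Simulated draws standing in for MCMC samples of one coefficient,
# centered on the POAbilabial estimate and standard error.
samples <- rnorm(10000, mean = -0.004416, sd = 0.002091)

# Two-tailed p: twice the smaller of the two tail proportions.
p_two_tailed <- 2 * min(mean(samples > 0), mean(samples < 0))
print(p_two_tailed)

# 95% interval from the empirical quantiles of the samples.
print(quantile(samples, c(0.025, 0.975)))
```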

SLIDE 20

Significance

  • Method 2: Model comparison
  • Construct a simple model, then a more complex model.
  • If the more complex model has a significantly better fit to the data, the additional factor makes a significant contribution.

SLIDE 21

Significance

> voicing.lmer = lmer(VOT~Voicing+(1|Subject),data=dat)
> voicingPOA.lmer = lmer(VOT~Voicing+POA+(1|Subject),data=dat)
> anova(voicing.lmer,voicingPOA.lmer)
Data: dat
Models:
voicing.lmer: VOT ~ Voicing + (1 | Subject)
voicingPOA.lmer: VOT ~ Voicing + POA + (1 | Subject)
                Df     AIC     BIC logLik  Chisq Chi Df Pr(>Chisq)
voicing.lmer     4 -4837.3 -4818.0 2422.6
voicingPOA.lmer  6 -4954.9 -4926.0 2483.4 121.60      2  < 2.2e-16 ***
---
Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1

  • Performs a likelihood ratio test (comparison of model fits)
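
The likelihood ratio test can be reproduced by hand from the logLik and Df columns of the anova() table. This is a sketch of the arithmetic only; the variable names are introduced here for illustration.

```r
# Values copied from the anova() output.
ll_simple  <- 2422.6; df_simple  <- 4
ll_complex <- 2483.4; df_complex <- 6

# Test statistic: twice the difference in log likelihood.
chisq  <- 2 * (ll_complex - ll_simple)
chi_df <- df_complex - df_simple
p      <- pchisq(chisq, df = chi_df, lower.tail = FALSE)
print(c(Chisq = chisq, ChiDf = chi_df, p = p))
```

The statistic matches the Chisq column (121.60 on 2 degrees of freedom).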

SLIDE 22

Significance

In this case the two methods give similar results:

  • Adding POA improves fit.
  • Interaction does not.
SLIDE 23

Random Effects

  • Could add in other random effects on intercept (e.g., different intercepts for each item).

  • Can also have random effects on slope.

– Allow the contrast between voiced and voiceless to vary across participants.
– Model specification:

> dat2.lmer = lmer(VOT~POA*Voicing+(1+Voicing|Subject),data=dat)

  • Can then use likelihood ratio to see if including this random effect improves fit (it does).

  • However, the pvals.fnc function doesn’t work with this type of random effect.

  • Likelihood ratio method still works; in this case, the interaction gets closer to significance (driven by velars).
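
A sketch of the random-slope comparison on simulated data (all numbers invented, with a per-subject voicing effect built in). It assumes lme4 is installed and skips the comparison otherwise; a simpler fixed-effects formula is used to keep the example short.

```r
set.seed(2)
# Simulated data (made-up numbers) with a per-subject voicing effect.
sim <- expand.grid(Subject = paste0("s", 1:10),
                   Voicing = c("voiced", "voiceless"),
                   rep = 1:20)
s <- as.integer(sim$Subject)
voiceless <- as.numeric(sim$Voicing == "voiceless")
sim$VOT <- 0.02 + rnorm(10, sd = 0.006)[s] +
  (0.025 + rnorm(10, sd = 0.01)[s]) * voiceless +
  rnorm(nrow(sim), sd = 0.016)

if (requireNamespace("lme4", quietly = TRUE)) {
  m_int   <- lme4::lmer(VOT ~ Voicing + (1 | Subject), data = sim)
  m_slope <- lme4::lmer(VOT ~ Voicing + (1 + Voicing | Subject), data = sim)
  # anova() refits both models with ML before the likelihood ratio test.
  print(anova(m_int, m_slope))
} else {
  message("lme4 is not installed; skipping the comparison")
}
```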

SLIDE 24

Data set 2: Accuracy

  • Kristin’s data: Speech perception in babble, varying native language of speaker and language of babble.

  • Interaction: Mandarin babble is much worse for Mandarin speakers than English speakers.

SLIDE 25

Data set 2: Accuracy

  • Issue with ANOVA: categorical variable!
  • Also: Missing data.

– For one subject, a few sentences were skipped.

  • (If we included SNR we’d be in even more trouble; it’s an ordinal, not a categorical, factor.)
  • Mixed effects doesn’t care!

– It’s predicting probability; the model explicitly takes into account the fact that the response is a probability measure.
– The data that serve as input are individual trials; if the number is different across subjects, that’s ok.
– Can use continuous or categorical predictors.

SLIDE 26

Analysis in R

  • Similar to above, but specify a different link function.

> dat = read.delim("kv.txt")
> dat.lmer = lmer(Accuracy~NativeLang*Babble+(1|Subject),data=dat,family="binomial")

  • “binomial” specifies that the output of the regression model is related to the dependent measure via a logistic function (= probability of one response).

  • Because the response is a categorical measure, the usual R methods apply.

– Since correct is first in the alphabet, it is the reference level; the model is predicting the probability of incorrect.
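
The reference-level behavior can be checked with a tiny example. This sketch uses base R's glm() with no random effects as a stand-in for the mixed model, and the responses in `acc` are invented. (In current lme4 releases, the mixed-effects version is fit with glmer() rather than lmer(..., family="binomial").)

```r
# Hypothetical responses; "correct" sorts first, so it is level 1.
acc <- factor(c("correct", "incorrect", "correct", "correct",
                "incorrect", "correct", "correct", "correct"))
levels(acc)

# With a factor response, binomial models treat the first level as
# "failure" and model the probability of the other level.
m <- glm(acc ~ 1, family = binomial)
p_hat <- unname(plogis(coef(m)))

print(p_hat)                       # fitted probability
print(mean(acc == "incorrect"))    # observed proportion incorrect
```

The fitted probability equals the proportion of incorrect trials, confirming which outcome is being predicted.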

SLIDE 27

Analysis in R

Output looks similar, but instead of t you have z scores, with probabilities.

Fixed effects:
                             Estimate Std. Error z value Pr(>|z|)
(Intercept)                   -1.3757     0.1483  -9.278  < 2e-16 ***
NativeLangMandarin             0.1980     0.2079   0.952    0.341
BabbleMan                     -1.1898     0.1415  -8.411  < 2e-16 ***
NativeLangMandarin:BabbleMan   0.7425     0.1809   4.104 4.06e-05 ***

Note that estimates do not refer to literal changes in the dependent measure; they refer to changes in the log odds (and hence the probability) of the response.
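
To see what the estimates imply on the probability scale, the sketch below sums the relevant coefficients for each condition and applies the logistic function. It assumes the reference levels are the non-Mandarin categories (which the coefficient names suggest but the slides do not state), and it ignores the subject random effect, so the values describe a typical subject.

```r
# Fixed-effect estimates copied from the output (log odds).
b <- c(intercept = -1.3757, mandarin = 0.1980,
       babble_man = -1.1898, interaction = 0.7425)

# Assumed coding: reference levels are the non-Mandarin categories.
cells <- rbind(
  ref_listener_ref_babble = c(1, 0, 0, 0),
  ref_listener_man_babble = c(1, 0, 1, 0),
  man_listener_ref_babble = c(1, 1, 0, 0),
  man_listener_man_babble = c(1, 1, 1, 1)
)
log_odds <- as.vector(cells %*% b)
p_incorrect <- setNames(plogis(log_odds), rownames(cells))
print(round(p_incorrect, 3))
```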

SLIDE 28

Analysis in R

Fixed effects:
          Estimate Std. Error z value Pr(>|z|)
BabbleMan  -1.1898     0.1415  -8.411  < 2e-16 ***

  • Mandarin babble is less likely (negative coefficient) to result in an incorrect response.

                             Estimate Std. Error z value Pr(>|z|)
NativeLangMandarin:BabbleMan   0.7425     0.1809   4.104 4.06e-05 ***

  • For Mandarin speakers, Mandarin babble is associated with a higher probability of error (relative to the overall trend).

SLIDE 29

Nobody’s perfect

  • Models still assume each observation is independent. This is patently false (what you do on one trial influences the next).

– Baayen: include behavioral measures from the previous trial as predictors.

  • Error is assumed to be normally distributed.

– ‘Residual’ distribution of data, after subtracting out factors, is assumed to be normally distributed. Also patently false.
– Baayen: to discover if these are skewing the results, exclude outliers (data points not well fit by the model) and re-fit the regression model. If the fit changes, violations of normality are problematic.
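
The exclude-and-refit check can be sketched as follows. For simplicity this uses an ordinary lm() on simulated data as a stand-in for the mixed model, and the 2.5 SD cutoff is an assumed, conventional choice; the slides do not specify one.

```r
set.seed(3)
# Simulated data with a few gross outliers (made-up numbers).
x <- runif(200)
y <- 1 + 2 * x + rnorm(200, sd = 0.5)
y[1:5] <- y[1:5] + 6

fit_all <- lm(y ~ x)

# Keep points whose standardized residual is within 2.5 SD;
# the 2.5 cutoff is an assumed, conventional choice.
keep <- abs(as.vector(scale(resid(fit_all)))) < 2.5
fit_trim <- lm(y ~ x, subset = keep)

# Compare coefficients before and after trimming.
print(rbind(all = coef(fit_all), trimmed = coef(fit_trim)))
```

If the two rows differ noticeably, the poorly fit points were influencing the estimates.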