
SLIDE 1

Today


  • A little on standard error of the mean and variation in estimates of central tendency.

  • A rough and ready primer on linear mixed effects models, a commonly-used tool for statistical analysis of experimentally collected data in (psycho-)linguistic circles.

SLIDE 2

Why all of this talk of populations, parameters, samples, and statistics?


  • For simplicity, let’s imagine that we only have two conditions in our experiment. And let’s imagine that we test our conditions on two different sets of 28 people (that’s a between-participant design).

  • We want to know if the two conditions are different (or have different effects on our participants). One way of phrasing this question is that we want to know if our two samples come from different populations, or whether they come from the same population.

[Diagram: a target sample (x 28) and a control sample (x 28), either drawn from two different populations or from the same population (x 56).]

So here is one mathematical thing we can do to try to answer this question. We can calculate the mean for each sample, and treat them as estimates of a population mean. Then we can look at those estimates and ask whether we think they are two estimates of one population mean, or whether they are two distinct estimates of two distinct population means.

SLIDE 3

Standard Error: How much do samples vary?


How can we tell if two sample means are from the same population or not? Well, one logic is as follows: First, we expect sample means to vary even though they are from the same population. Each sample that we draw from a population will be different, so their means will be different. The question is how much will they vary?

[Diagram: a population (x 10,000) with samples drawn from it: sample 1 (x 20) = x̄1, sample 2 (x 20) = x̄2, sample 3 (x 20) = x̄3, … up to 10,000 choose 20 samples.]

We could, in principle, figure this out by collecting every possible sample from a population. If we calculated a mean for each one, those sample means would form a distribution. We call this the sampling distribution of the mean. We could then calculate the variance and standard deviation of that distribution, and that would tell us how much sample means vary when they come from the same population! Its mean is the mean of the population that the samples come from. Its standard deviation is called the standard error of the mean.

SLIDE 4

Plotting the sampling distribution of the mean


In the script parameters.statistics.r, I used R to generate a population of 10,000 values with a mean of 0 and a standard deviation of 1. We’ve already seen this.

[Plot: histogram of the population (x 10,000).]

I then took 1,000 samples from the population, each with 20 values. I calculated the mean for each one, and plotted that distribution. This is a simulation of the sampling distribution of the mean.

[Plot: histogram of the 1,000 sample means (m).]

The mean of the sampling distribution of the means is the population mean!
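Here is a minimal sketch of that simulation in base R. It is illustrative rather than the verbatim contents of parameters.statistics.r, and the seed is an arbitrary addition for reproducibility:

    set.seed(1)                                   # arbitrary seed, for reproducibility
    population <- rnorm(10000, mean = 0, sd = 1)  # population of 10,000 values

    # draw 1,000 samples of 20 values each, recording each sample's mean
    sample.means <- replicate(1000, mean(sample(population, 20)))

    mean(sample.means)   # very close to the population mean (0)
    hist(sample.means)   # the simulated sampling distribution of the mean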

SLIDE 5

Estimating the standard error


The standard deviation of the sampling distribution of the mean is called the standard error. We can calculate it from the simulated distribution using the standard deviation formula. The result for our simulation is plotted in blue below. (We typically don’t have this distribution in real life, so we can’t simply calculate it. We have to estimate it.)

To estimate the standard error from a sample we use the formula: s/√n. In real life, you usually have one sample to do this. But we have 1,000 samples in our simulation, so we can calculate 1,000 estimates. To see how good they are, we can calculate the difference between each estimate and the empirical standard error calculated above. As the distribution of those differences below shows, the mean is very close to 0. They are good estimates!

[Plot: histogram of the 1,000 sample means (m), with the empirical standard error marked in blue.]

[Plot: histogram of the differences between each s/√n estimate and the empirical standard error, centered very close to 0.]
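A minimal sketch of this comparison in base R (illustrative, with arbitrary names and seed):

    set.seed(2)
    population   <- rnorm(10000, mean = 0, sd = 1)
    sample.means <- replicate(1000, mean(sample(population, 20)))

    empirical.se <- sd(sample.means)   # SD of the simulated sampling distribution

    # 1,000 estimates of the standard error: one s/sqrt(n) per sample of 20
    estimates <- replicate(1000, sd(sample(population, 20)) / sqrt(20))

    mean(estimates - empirical.se)   # very close to 0
    hist(estimates - empirical.se)   # the distribution of differences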

SLIDE 6

Working with sample means: A problem


Before going a bit further, I want to introduce a concrete example for us to talk about. In the fictional 51st state of Western Massia, Prof. Dylan O’Brien sets out to measure the rate of Specific Language Impairment (SLI) in the population. They measure the rate of SLI incidence by town. They find that the smallest town, Ammerste (pop. 90), has the highest rate of SLI in the state: 30%. That is more than twice the rate in the largest town in Western Massia, Belle-chère Town (pop. 35,000), where the rate of SLI is 12%. Before doing some stats, let’s first think: why might the rate of SLI be so much higher in Ammerste than it is in Belle-chère Town?

SLIDE 7

A simple truth about the standard error

[Plots: simulated sampling distributions of the mean (m) for samples of 20, 10, and 5 values; the distributions get wider as the sample size shrinks.]

The standard error is the standard deviation of the sampling distribution of the mean. It grows as the sample size shrinks (s/√n). In practical terms, this means more extreme sample means are more likely with smaller sample sizes. The smaller your sample size, the more likely you are to observe something that is very far from the population mean! This leads to a practical warning: be careful with small sample sizes. The chances of seeing something wacky or misleading can be quite high!!
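You can see this directly in a quick simulation; a sketch (illustrative seed and names):

    set.seed(3)
    population <- rnorm(10000, mean = 0, sd = 1)

    # the SD of the sample means (the standard error) grows as n shrinks
    for (n in c(20, 10, 5)) {
      means <- replicate(1000, mean(sample(population, n)))
      cat("n =", n, " SD of sample means =", round(sd(means), 3), "\n")
    }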

SLIDE 8

And now we can explain why we use standard error in our graphs


OK, so now we know that the standard error is a measure of how much sample means from the same population will vary. So now we can use the following logic: if two sample means differ by a lot relative to the standard error, then either they are from different populations, or something relatively rare has occurred (e.g., we drew samples from the two ends of the sampling distribution of the mean).

[Plot: mean z-score judgments by dependency length (short/long) and embedded structure (non-island/island), with standard error bars.]

Cashing this logic out quantitatively is the domain of statistics (and we will learn some of this soon). But at least you can see why we use standard errors in our figures. Since we are comparing means in our figures, the standard errors allow us to compare the size of the variability between means. Again, the formula for the estimated standard error is the sample standard deviation divided by the square root of the sample size, or s/√n. There is no built-in function for this in R, so it is good to memorize it.
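Because there is no built-in function, a one-line helper is common. A sketch (the function and variable names are mine, not from the course scripts):

    std.error <- function(x) sd(x) / sqrt(length(x))   # s / sqrt(n)

    judgments <- c(0.2, -0.1, 0.4, 0.3, 0.0)   # made-up z-scores
    std.error(judgments)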

SLIDE 9

A related concept: the 95% CI


Another representation of the variance you will sometimes see plotted, or presented in text, is the 95% confidence interval (CI). The general formula for a 95% CI around a sample mean is:

x̄ ± critical.value × standard.error

where the critical value comes from your statistical test. For example, if you are doing a t-test, it will be the t-value at which your t-test would reach statistical significance.

The 95% CI is a range constructed from a sample mean such that intervals constructed this way will contain the true population mean 95% of the time. However, be warned: the 95% CI is not a distribution over plausible values of the population mean… however much you might like it to be!

Warning: for within-subjects designs the correct calculation of the 95% CI is a little more nuanced; see Bakeman & McArthur (1996).
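A minimal sketch of computing such a CI for a single sample in base R, using the t critical value (illustrative only, and remember the within-subjects caveat above):

    x <- rnorm(20, mean = 0.3, sd = 1)      # a made-up sample
    se   <- sd(x) / sqrt(length(x))         # estimated standard error
    crit <- qt(0.975, df = length(x) - 1)   # two-tailed t critical value
    mean(x) + c(-1, 1) * crit * se          # lower and upper bounds of the 95% CI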

SLIDE 10

The most important lesson in stats: Statistics is a field of study, not a tool


Statistics is its own field. There is a ton to learn, and more is being discovered every day. Statisticians have different philosophies, theories, tastes, etc. They can’t tell you the “correct” theory any more than we can tell them the “correct” theory of linguistics.

What we want to do is take this large and vibrant field, and convert it into a tool for us to use when we need it. This is a category mismatch. Imagine if somebody tried to do that with linguistics. We would shake our heads and walk away… But statistics is in a weird position, because other sciences do need the tools that statisticians develop to get work done. And statistics wants to solve those problems for science. So we have to try to convert the field into a set of tools.


SLIDE 11

What you will run for (most) papers


Obviously, I am not qualified to teach you the actual field of statistics. And there is no way to give you a complete understanding of the “tool version” of statistics that we use in experimental syntax in the time we have here. So here is my idea. I am going to start by showing you the R commands that you are going to run for (most) of your experimental syntax papers. Then we will work backwards to figure out exactly what information these commands are giving you.

1. Load the lmerTest package:

    library(lmerTest)

2. Create a linear mixed effects model with your fixed factors (e.g., factor1 and factor2) and random factors for subjects and items:

    model.lmer = lmer(responseVariable ~ factor1*factor2 + (1+factor1*factor2|subject) + (1|item), data=yourDataset)

3. Run the anova() function to derive F statistics and p-values using the Satterthwaite approximation for degrees of freedom:

    anova(model.lmer)

SLIDE 12

The results for our data


If we run the following code in the script called linear.mixed.effects.models.r:

    wh.lmer = lmer(zscores ~ embeddedStructure*dependencyLength + (1|subject) + (1|item), data=wh)

and then use the summary() and anova() functions:

    summary(wh.lmer)
    anova(wh.lmer)

we get the results shown on the slide. In this section we want to try to understand what the model above is modeling, and what the information in the summaries is telling us.

SLIDE 13

Theories, models, and hypothesis tests


Substantive Theories → Mathematical Models → Hypothesis Tests

As scientists, theories are what we really care about. Substantive theories are written in the units of that science; e.g., syntactic theories are written in terms of features, operations, tree-structures, etc.

We want to find evidence for our theories. But what counts as evidence? One possible answer (among many) is: (i) a successful theory will predict observable data, therefore (ii) we can use a measure of how well a theory predicts the data as evidence for/against a theory. If we adopt this view, we need to link our theories to observable data in a way that lets us quantify that relationship. In short, we need a mathematical model that relates our theory to the data.

This opens up lots of doors for us. We can create metrics to evaluate how good a model is, and compare models for goodness. And we can use probability theory to answer questions like “how likely is this data given this theory?” and “how likely is this theory given this data?”. Once we have models, and metrics for comparing them, we may want to formalize a criterion for choosing one model/theory over another. In other words, a test.

SLIDE 14

Constructing a model for our theory


The theory of wh-islands: Our theory is that there is a constraint on the extraction of wh-words out of embedded questions.

[Schematic: Acceptability = Grammar + Noise (memory, parsing, world, thought) + Task Effects]

Our model: We already have a model in mind for our theory. We think that this constraint will affect acceptability. So we need a model of acceptability that has a spot for this constraint. All we need to do is translate this model of acceptability into a specific equation for our experiment. Here is what it is going to look like:

acceptabilityi = β0 + β1structure(0,1) + β2dependency(0,1) + β3structure1:dependency1 + εi

Now let’s spend the next several slides building this equation so you can see where it came from.

SLIDE 15

This is a model to predict every judgment


We have 224 judgments in our dataset. We want a model that can explain every one of them. We capture this with the i subscript:

acceptabilityi = β0 + β1structure(0,1) + β2dependency(0,1) + β3structure1:dependency1 + εi

This is shorthand for i = 1 to 224:

acceptability1 = β0 + β1structure0 + β2dependency0 + β3structure1:dependency1 + ε1
acceptability2 = β0 + β1structure0 + β2dependency1 + β3structure1:dependency1 + ε2
acceptability3 = β0 + β1structure1 + β2dependency0 + β3structure1:dependency1 + ε3
acceptability4 = β0 + β1structure1 + β2dependency1 + β3structure1:dependency1 + ε4
…
acceptability224 = β0 + β1structure1 + β2dependency1 + β3structure1:dependency1 + ε224

Also notice that when we write out the individual equations for each judgment in our dataset, certain other numbers become concrete. The subscript on the structure and dependency factors becomes a specific number (0 or 1), and the i subscript on the ε term takes the same value as the judgment.

SLIDE 16

Coding the variables


The factors in our experiment are categorical (non-island/island, short/long).

acceptability1 = β0 + β1structure0 + β2dependency0 + β3structure1:dependency1 + ε1
acceptability2 = β0 + β1structure0 + β2dependency1 + β3structure1:dependency1 + ε2
acceptability3 = β0 + β1structure1 + β2dependency0 + β3structure1:dependency1 + ε3
acceptability4 = β0 + β1structure1 + β2dependency1 + β3structure1:dependency1 + ε4
…

Categorical variables can either be turned into 0 and 1 (treatment coding), or into -1 and 1 (effects coding). There is a difference between them that we will talk about in a few minutes. But for now, let’s choose 0 and 1, like so:

structure: non-island = 0, island = 1
dependency: short = 0, long = 1

Now look at the first four equations above. Can you see which condition each one represents? The first is non-island because its structure is 0, and it is short because its dependency is also 0. The fourth is island because its structure is 1, and it is long because its dependency is 1.
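In R, this coding can be set up by hand; a small sketch (the column names embeddedStructure and dependencyLength are from our dataset, but the level labels and new column names here are assumptions):

    # treatment (0/1) coding by hand; level labels are assumed
    wh$structure.code   <- ifelse(wh$embeddedStructure == "island", 1, 0)
    wh$dependency.code  <- ifelse(wh$dependencyLength == "long", 1, 0)
    wh$interaction.code <- wh$structure.code * wh$dependency.code   # 1 only when both are 1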

SLIDE 17

What are the Betas?


The betas in this equation are coefficients. They are the numbers that turn the 0s and 1s into an actual effect on acceptability.

acceptability1 = β0 + β1structure0 + β2dependency0 + β3structure1:dependency1 + ε1
acceptability2 = β0 + β1structure0 + β2dependency1 + β3structure1:dependency1 + ε2
acceptability3 = β0 + β1structure1 + β2dependency0 + β3structure1:dependency1 + ε3
acceptability4 = β0 + β1structure1 + β2dependency1 + β3structure1:dependency1 + ε4
…

The idea is that you multiply the beta by the 0 or 1 in the factor to get an effect. So when the factor is 0, there is no effect. And when the factor is 1, you get an effect that is the same size as the beta.

It is important to note that each beta is constant. β1 is always β1. It doesn’t have another subscript that varies for each judgment (unlike the ε term). This is why each beta can be seen as an effect. β1 is the effect of having an island structure. β2 is the effect of having a long dependency.

SLIDE 18

structure1:dependency1 is the violation


The structure1:dependency1 term looks strange because it is the interaction term (the colon is a way of notating this). It is the special extra effect that occurs when the levels of the two factors are both 1. Basically, you can think of it as a 1 when both factors are 1, and a 0 otherwise (00, 01, 10).

acceptability1 = β0 + β1structure0 + β2dependency0 + β3structure1:dependency1 + ε1
acceptability2 = β0 + β1structure0 + β2dependency1 + β3structure1:dependency1 + ε2
acceptability3 = β0 + β1structure1 + β2dependency0 + β3structure1:dependency1 + ε3
acceptability4 = β0 + β1structure1 + β2dependency1 + β3structure1:dependency1 + ε4
…

The interaction term does nothing for the first three conditions, because it is equivalent to a 0 then. In the fourth condition (1,1) it is a 1. In this condition, that 1 is multiplied by β3 to add to the effect. This means that β3 is the size of the violation effect (it is the DD score from earlier!). Note that this is only true with treatment (0,1) coding. The coefficients have different interpretations with different codings.

In our substantive theory, this mathematical term captures the effect of a violation. The island/long condition (1,1) is the only condition that meets the structural description of the island constraint.

SLIDE 19

ε is the error term


If you just look at the betas and factors, you will quickly see that we can only generate 4 acceptability judgments: one for each condition in our experiment (00, 01, 10, 11). But we have 224 values that we need to model. And that is where the ε term comes in.

acceptability1 = β0 + β1structure0 + β2dependency0 + β3structure1:dependency1 + ε1
acceptability2 = β0 + β1structure0 + β2dependency1 + β3structure1:dependency1 + ε2
acceptability3 = β0 + β1structure1 + β2dependency0 + β3structure1:dependency1 + ε3
acceptability4 = β0 + β1structure1 + β2dependency1 + β3structure1:dependency1 + ε4
…

The ε term is an error term. It is the difference between the value that the model predicts and the actual value of the judgment. This is why its subscript varies: we need a different ε term for each judgment.

This may seem like a hack, but it is principled. The other parts of our model capture the things that we manipulated in our experiment. The error term captures all of the things that we couldn’t control: individual differences in the participants, differences in the items, effects of the task, etc. (And we will see later that we can model some of these things, at least a little bit.)

SLIDE 20

The model


[Schematic: Acceptability = Grammar + Noise (memory, parsing, world, thought) + Task Effects]

acceptabilityi = β0 + β1structure(0,1) + β2dependency(0,1) + β3structure1:dependency1 + εi

So that’s the whole model. We can think of this as breaking down all of the variance in our dependent measure (acceptability measurements) into the variance that is predicted by the manipulation of experimental factors, and the remaining variance that is not. Put differently, the model partitions variance in our measurements into explained variance (the betas, or the effect of our experimentally manipulated factors) and unexplained variance (the ε’s, or all the other stuff that we are not modeling and which therefore appears random to us).

SLIDE 21

The model


[Schematic: Acceptability = Grammar + Noise (memory, parsing, world, thought) + Task Effects]

acceptabilityi = β0 + β1structure(0,1) + β2dependency(0,1) + β3structure1:dependency1 + εi

There is some structure in our ε’s, however… it’s a little more nuanced than just saying they’re ‘random.’ The LMER model we adopt assumes that the errors are:

  • Normally distributed
  • Independent of each other
  • Of equal variance (homoscedastic)

Warning: If your data violate these assumptions, inferences drawn from the resulting model may be suspect! We’ll come back to these.

SLIDE 22

We minimize the ε’s to estimate β’s


Once you’ve specified your model (as we have here), the next step is to find the coefficients that make for a good model. One way to define “good” is to say that a good model will minimize the amount of stuff that is unexplained. Well, all of our unexplained stuff is captured by the ε terms, so this means that we want to minimize the ε’s.

Here is a toy example with 3 values and a simple model with only one beta: acci = β0 + εi. Let’s imagine we have three judgments to model (2, 3, 4). If we choose the value 4 for the coefficient β0, we get ε terms (-2, -1, 0), which we can square and sum to derive a sum of squares:

2 = 4 + (-2)
3 = 4 + (-1)
4 = 4 + 0        SS = 4 + 1 + 0 = 5

Now, let’s imagine we have the same data, but we choose 3 for the coefficient β0. Now we get smaller error terms, and consequently a smaller SS. This is a better model, because less is unexplained:

2 = 3 + (-1)
3 = 3 + 0
4 = 3 + 1        SS = 1 + 0 + 1 = 2
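The same arithmetic in a few lines of R (a sketch; the helper name is mine):

    acc <- c(2, 3, 4)                       # the three toy judgments
    ss  <- function(b0) sum((acc - b0)^2)   # sum of squared errors for a given β0
    ss(4)        # 5, as above
    ss(3)        # 2: a better model
    mean(acc)    # 3, the value of β0 that minimizes the sum of squares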

SLIDE 23
Putting it all together

[Plot: mean z-score judgments by dependency length (short/long) and embedded structure (non-island/island), with the four condition means annotated with their coefficient sums.]

You specify the model for R. That was the command we entered into the console. R will then find the best values of the coefficients for the data that you gave it.

acceptabilityi = β0 + β1structure(0,1) + β2dependency(0,1) + β3structure1:dependency1 + εi

The four condition means correspond to β0, β0+β1, β0+β2, and β0+β1+β2+β3. And you might recall that this is exactly the 2x2 logic that we discussed earlier. Each εi is the distance between a raw data point and its condition mean.
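To make the mapping concrete, here is a sketch with made-up coefficient values (these are not estimates from our data):

    b0 <- 0.5; b1 <- -0.3; b2 <- -0.2; b3 <- -0.4   # made-up values
    b0                   # non-island, short (0,0)
    b0 + b1              # island, short (1,0)
    b0 + b2              # non-island, long (0,1)
    b0 + b1 + b2 + b3    # island, long (1,1)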

SLIDE 24

The R command


Now that we understand our linear model, we can compare it to the R command that we ran at the beginning of this section. Here they are side by side so that you can see the correspondence:

acceptabilityi = β0 + β1structure(0,1) + β2dependency(0,1) + β3structure1:dependency1 + εi

lmer(zscores ~ embeddedStructure + dependencyLength + embeddedStructure:dependencyLength + (1|subject) + (1|item), data=wh)

You don’t need to specify the intercept (β0) in the command. R includes one by default (you can, however, tell it not to estimate an intercept by adding -1 to the model formula). You don’t need to specify the error term (εi) in the command either. Again, R includes one by default.

You will also notice that the lmer() formula contains extra bits: (1|subject) and (1|item). That is because the model above only has fixed effects. The (1|subject) and (1|item) terms are random effects. We will turn to those next.

SLIDE 25

The R command - a shortcut


You may have noticed that the command I just showed you is not exactly the command in the script (or on the slide at the beginning of this section). That is because there is a shortcut in R for specifying two factors and an interaction:

acceptabilityi = β0 + β1structure(0,1) + β2dependency(0,1) + β3structure1:dependency1 + εi

lmer(zscores ~ embeddedStructure * dependencyLength + (1|subject) + (1|item), data=wh)

When you want all three effects, you can use the * operator instead of a +. R will automatically expand this to all three components:

embeddedStructure
dependencyLength
embeddedStructure:dependencyLength

It is a nice shortcut that really saves you time if you have more than two factors, because the number of components grows exponentially (counting the intercept, a 2x2x2 expands to 8 terms, and a 2x2x2x2 to 16).

SLIDE 26

Subject differences


Let’s talk about the first term (1|subject). As the name suggests, this term captures differences between the subjects in our dataset.

lmer(zscores ~ embeddedStructure * dependencyLength + (1|subject) + (1|item), data=wh)

The plot below shows the mean rating of the 4 experimental conditions for each subject. As you can see, there is quite a bit of variability.

[Plot: mean z-scores by subject.]

The (1|subject) term in the model tells R to estimate an intercept for each subject. This intercept is added to each subject’s judgments to try to account for these differences. Basically, instead of having these subject differences contaminate the effects of interest, or having these differences sit in an error term, this asks the model to estimate them. The code for this plot is in subject.item.differences.r.
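A minimal sketch of such a by-subject plot (illustrative; subject.item.differences.r may do it differently):

    library(ggplot2)

    # mean z-score per subject, then a simple dot plot
    subj.means <- aggregate(zscores ~ subject, data = wh, FUN = mean)
    ggplot(subj.means, aes(x = subject, y = zscores)) + geom_point()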

SLIDE 27

Subject differences


Let’s talk about the first term (1|subject). As the name suggests, this term captures differences between the subjects in our dataset.

lmer(zscores ~ embeddedStructure * dependencyLength + (1|subject) + (1|item), data=wh)

Another term we might informally use for random effects in the context of our experimental designs is grouping factors.

For within-subjects or within-items designs, using these random effects/grouping factors is critical because without them, we would violate a critical model assumption: independent error terms. For example, if there is a generous rater in our experiment, all their judgments might be above the condition means. This means, in our model, that all their ε’s are going to be positive. The values of all those ε’s are correlated, since they all come from the same person. That correlation violates our assumption and would be very problematic. The inclusion of the grouping factor/random effect accounts for the idiosyncrasies of this fictional participant, and addresses this problem.

SLIDE 28

Item differences


The second term, (1|item), is similar. As the name suggests, this term captures differences between the items in our dataset.

lmer(zscores ~ embeddedStructure * dependencyLength + (1|subject) + (1|item), data=wh)

Once again, we can plot the means of each item to see their differences. Now, we expect differences between items based on their condition. But as you can see by the colors (colors = condition), there are differences between items within a single condition.

[Plot: mean z-scores by item, colored by condition (wh.isl.lg, wh.isl.sh, wh.non.lg, wh.non.sh).]

This code asks R to estimate an intercept for each item, and add it whenever that item is being modeled. This makes sure that it isn’t contributing to the other (important) effects, or to the error term. The code for this plot is in subject.item.differences.r.

SLIDE 29

Fixed factors vs Random factors


Now, you may have noticed that our experimental factors look different from these subject and item factors in the R command. This is because the former are fixed factors and the latter are random factors.

lmer(zscores ~ embeddedStructure * dependencyLength + (1|subject) + (1|item), data=wh)

In the command, the fixed factors are embeddedStructure and dependencyLength, and the random factors are subject and item.

There are two common ways to define the difference between fixed and random factors. The first is operational, the second is mathematical:

1. Fixed factors are factors whose levels must be replicated exactly in order for a replication to count as a replication. Random factors are factors whose levels will most likely not be replicated exactly in a replication of the experiment.

2. Fixed factors are factors whose levels exhaust the full range of possible level values (as they are defined in the experiment). Random factors are factors whose levels do not exhaust the full range of possible level values.

SLIDE 30

Random intercepts and slopes


One last note about random factors. So far, we’ve only specified random intercepts — one value for each subject and one value for each item. But we can also specify random slopes. A random slope specifies a different value based on the values of the fixed factors (remember in our linear model, it is the fixed factors that specify the slopes of the lines).

lmer(zscores ~ embeddedStructure * dependencyLength + (1+embeddedStructure*dependencyLength|subject) + (1|item), data=wh)

The code for this looks complicated at first glance, but it isn’t. We simply copy the fixed factor structure into the random subject term. The 1 in the code tells R to estimate an intercept for each subject. The next bit tells R to estimate three more random coefficients per subject: one for embeddedStructure, one for dependencyLength, and one for the interaction embeddedStructure:dependencyLength. There is a “best practices” claim in the field (Barr et al. 2013) that you should specify the “maximal” random effects structure licensed by your design. This means specifying random slopes if your design allows it. The problem is that maximal random effects structures sometimes don’t converge (R can’t find a solution). In that case, you need to use a simpler model.
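When that happens, one common simplification (an illustration, not a prescription from these slides) is to prune the random effects structure, for example dropping the interaction slope:

    library(lmerTest)

    # random slopes for the two main effects, but no interaction slope
    model.simpler <- lmer(zscores ~ embeddedStructure * dependencyLength +
                          (1 + embeddedStructure + dependencyLength | subject) +
                          (1 | item), data = wh)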

SLIDE 31

This is a linear mixed effects model


A model that only has fixed effects is usually just called a linear model, though it is perhaps more correctly a linear fixed effects model.

lmer(zscores ~ embeddedStructure * dependencyLength + (1|subject) + (1|item), data=wh)

A model that has both fixed factors and random factors is called a mixed model, so if it is linear, it is a linear mixed effects model.

In R, there is a package called lme4 that exists to fit linear mixed effects models. You could load lme4 directly and create the linear mixed effects model above; the function lmer() is a function from lme4.

We are using the package lmerTest to run our models. The lmerTest package calls lme4 directly (when you installed it, it also installed lme4). The reason we are using lmerTest is that lmerTest also includes some functions that let us calculate inferential statistics, like F-statistics and p-values. The lme4 package doesn’t do that by itself.

SLIDE 32

The Random slopes model in our script


Our script linear.mixed.effects.models.r contains the code for both an intercept-only model and a random slopes model. You should try running them.

What you will find is that the intercept-only model runs fine, but the slopes model fails to converge. Like I said, this happens with random slopes models. It turns out that the problem with the model is our coding of the factors. We used treatment coding. Treatment coding in 2x2 designs can be problematic, and we’ll come back to why this is the case. For now, we note that there is a different coding scheme, known as effect coding, that alleviates this problem.

So what should we do? Well, the coding doesn’t affect things like F-statistics, t-statistics, and p-values in most (but not all!) cases. Those will be the same regardless of the coding scheme. So if that is all you care about, go ahead and change the coding. What does change is the interpretation of the coefficients in the model. In the next few slides, I will show you this change in interpretation. But the bottom line is that if the interpretation is important to you, you either need to drop the random slopes, or translate the effect coding estimates into treatment coding estimates by hand.
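A sketch of what changing the coding looks like in R, assuming the two predictors are stored as two-level factors in wh:

    library(lmerTest)

    # switch from the default treatment coding (0/1) to effect coding (1/-1)
    contrasts(wh$embeddedStructure) <- contr.sum(2)
    contrasts(wh$dependencyLength)  <- contr.sum(2)

    # then refit the random slopes model
    wh.lmer.slopes <- lmer(zscores ~ embeddedStructure * dependencyLength +
                           (1 + embeddedStructure * dependencyLength | subject) +
                           (1 | item), data = wh)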

SLIDE 33

Simple effects vs Main effects


The first step to understanding the difference between treatment coding and effect coding is to understand the difference between simple effects and main effects.

[Plot: acceptability of the four conditions (1, 2, 3, 4) by dependency length (short/long).]

Simple effects are a difference between two conditions. Typically, a simple effect is defined relative to one condition, the baseline condition. So if condition 1 were the baseline condition, we could define two simple effects: the effect of 1 vs. 2, and the effect of 1 vs. 3. The effect of 1 vs. 4 is the sum of these two (in this example).

SLIDE 34

Simple effects vs Main effects

The first step to understanding the difference between treatment coding and effect coding is to understand the difference between simple effects and main effects.

[Plot: acceptability of the four conditions, annotated with the grand mean, the short and long means, and the non-island and island means.]

Main effects are the difference between the grand mean of all conditions and the average of one level across both levels of the other factor. Again, in a 2x2 design we can define two main effects: embeddedStructure and dependencyLength. Each one goes in two directions (one positive, one negative). The blue arrows are the main effect of dependencyLength (positive and negative change from the grand mean). The orange arrows are the main effect of embeddedStructure (positive and negative change from the grand mean). Each condition is a combination of the two main effects (in this example).

SLIDE 35
Treatment coding reveals simple effects

[Plot: mean z-score judgments by dependency length and embedded structure.]

In treatment coding, each level is either 0 or 1. This is what we’ve been using so far. Treatment coding is great when one of your conditions can be considered a baseline in your theory.

acceptabilityi = β0 + β1structure(0,1) + β2dependency(0,1) + β3structure1:dependency1 + εi

Treatment coding coefficients show you simple effects: the difference between the baseline condition and another condition. It works well for some designs, and less so for others (e.g., when you have no clear baseline). Under this coding, the condition means are:

(0,0) = β0
(0,1) = β0+β2
(1,0) = β0+β1
(1,1) = β0+β1+β2+β3

SLIDE 36
Effect coding reveals main effects

[Plot: mean z-score judgments by dependency length and embedded structure, with each condition mean annotated with its coefficient sum.]

In effect coding, the factors are given the values 1 or -1. This doesn’t change the model that we specify, but it changes the interpretation of the coefficients. Effect coding is helpful when there is no clear “baseline” condition.

acceptabilityi = β0 + β1structure(1,-1) + β2dependency(1,-1) + β3structure1:dependency1 + εi

Effect coding coefficients show you main effects. But be careful: main effects are not straightforward to interpret when there is an interaction (because the interaction contaminates them). Under this coding, the condition means are:

(1,1) = β0+β1+β2+β3
(1,-1) = β0+β1-β2-β3
(-1,1) = β0-β1+β2-β3
(-1,-1) = β0-β1-β2+β3

and the level means are β0+β1 and β0-β1 for the two levels of structure, and β0+β2 and β0-β2 for the two levels of dependency.
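You can see what R assigns under each scheme with its built-in contrast functions:

    contr.treatment(2)   # treatment coding: the two levels get 0 and 1
    contr.sum(2)         # effect (sum) coding: the two levels get 1 and -1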
SLIDE 37

Choosing a contrast coding


Contrast coding is primarily about interpreting the coefficients in your model.

Effect coding is best when you don’t have a clear baseline, or when you care about main effects (average effects of a factor). If you do care about main effects, remember that the presence of an interaction makes it impossible to interpret main effects on their own (because the interaction contaminates them). Effect coding is sometimes known as ANOVA-style coding; this coding scheme allows you to interpret the resulting coefficients in a way that is essentially similar to the factors and interaction in a 2x2 ANOVA, if that is familiar.

Treatment coding is best if you have a clear baseline condition, and care about simple effects (differences from the baseline).

SLIDE 38

Choosing a contrast coding


Finally, there are times where it is better, mathematically, to use effect coding. Here are some:

1. Some random slopes models won’t converge with treatment coding, but will converge with effect coding (like our random slopes model). This can occur for a variety of reasons, but one common one is collinearity. In the present case, collinearity occurs when your estimate of the size of the interaction term (its beta) depends on your estimates for the simple effects (their betas). When this occurs, the estimates for these predictors are correlated (or collinear). This can create problems for the routines that estimate the parameters! This is the likely culprit behind our treatment coding model’s failure to converge.

2. Collinearity can be quite a problem for the estimation of parameters in mixed effects models quite generally, leading to unreliable estimates and the so-called ‘bouncing betas’ problem: small changes in the data lead to drastic changes in effect estimates. If you observe collinearity in your model, interpret the model with caution!

3. If you are mixing categorical and continuous factors, treatment coding can introduce heteroscedasticity (variable variance). Effect coding does not.