
Gov 51: Expectation, Variance, and Sample Means (Matthew Blackwell)


  1. Gov 51: Expectation, Variance, and Sample Means. Matthew Blackwell, Harvard University

  2. Remember our goal
     [Diagram: probability runs from the population to the sample; inference runs from the sample back to the population.]
     • We want to learn about the chance process that generated our data.
     • Last time: entire probability distributions. Is there something simpler?

  3. How can we summarize distributions?
     • Two numerical summaries of a distribution are useful:
       1. Mean/expectation: where the center of the distribution is.
       2. Variance/standard deviation: how spread out the distribution is around the center.
     • These are population parameters, so we don't get to observe them, but we'll use our sample to learn about them.

  4. Two ways to calculate averages
     • Calculate the average of {1, 1, 1, 3, 4, 4, 5, 5}:
       $\frac{1 + 1 + 1 + 3 + 4 + 4 + 5 + 5}{8} = 3$
     • Alternative way to calculate the average, based on frequency weights:
       $1 \times \frac{3}{8} + 3 \times \frac{1}{8} + 4 \times \frac{2}{8} + 5 \times \frac{2}{8} = 3$
     • Each value times how often that value occurs in the data.
     • We'll use this intuition to create an average/mean for r.v.s.
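The two calculations on this slide can be checked with a short Python sketch. The variable names are illustrative and not part of the slides; exact fractions are used so the two results can be compared without rounding.

```python
from collections import Counter
from fractions import Fraction

# The eight values from the slide.
data = [1, 1, 1, 3, 4, 4, 5, 5]

# Way 1: add everything up and divide by the number of values.
plain_mean = Fraction(sum(data), len(data))

# Way 2: each distinct value times the share of the data it accounts for.
counts = Counter(data)
weighted_mean = sum(value * Fraction(count, len(data))
                    for value, count in counts.items())

print(plain_mean, weighted_mean)  # 3 3
```

Both routes give 3 because the frequency weights (3/8, 1/8, 2/8, 2/8) are just the counts of each value divided by the total count.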

  5. Expectation
     • We write $\mathbb{E}(X)$ for the mean of an r.v. $X$.
     • For discrete $X \in \{x_1, x_2, \ldots, x_k\}$ with $k$ levels:
       $\mathbb{E}[X] = \sum_{j=1}^{k} x_j \, \mathbb{P}(X = x_j)$
     • Weighted average of the values of the r.v., weighted by the probability of each value occurring.
     • If $X$ is the age of a randomly selected registered voter, then $\mathbb{E}(X)$ is the average age in the population of registered voters.
     • Notation notes:
       • Lots of other ways to refer to this: expectation or expected value.
       • Often called the population mean to distinguish it from the sample mean.
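As a minimal sketch of the expectation formula, the function below takes a probability mass function as a dictionary and returns the probability-weighted sum. The fair six-sided die is a hypothetical example chosen for illustration; it does not appear in the slides.

```python
from fractions import Fraction


def expectation(pmf):
    """Return the sum of value * probability for a discrete r.v.

    `pmf` maps each possible value to its probability.
    """
    if sum(pmf.values()) != 1:
        raise ValueError("probabilities must sum to 1")
    return sum(x * p for x, p in pmf.items())


# Hypothetical example: a fair die, each face with probability 1/6.
die = {face: Fraction(1, 6) for face in range(1, 7)}

print(expectation(die))  # 7/2
```

The result, 7/2, is not a value the die can ever show, which is a useful reminder that the expectation describes the center of the distribution rather than a typical outcome.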

  6. Properties of the expected value
     • We use properties of $\mathbb{E}(X)$ to avoid using the formula every time.
     • Let $X$ and $Y$ be r.v.s and $a$ and $b$ be constants.
       1. $\mathbb{E}(a) = a$
          • Constants don't vary.
       2. $\mathbb{E}(aX) = a\,\mathbb{E}(X)$
          • Suppose $X$ is income in dollars; income in units of 10,000 dollars is just $X / 10000$.
          • The mean of this new variable is the mean of income in dollars divided by 10,000.
       3. $\mathbb{E}(aX + bY) = a\,\mathbb{E}(X) + b\,\mathbb{E}(Y)$
          • Expectations can be distributed across sums.
          • $X$ is partner 1's income, $Y$ is partner 2's income.
          • Mean household income is the sum of each partner's mean income.
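The third property can be verified exactly on a small example. The joint distribution below is hypothetical, invented for this sketch, and is deliberately built so that the two variables are dependent: the property holds without any independence assumption.

```python
from fractions import Fraction

# Hypothetical joint pmf over pairs (x, y); not from the slides.
# The two variables are dependent: P(x=1, y=1) = 1/4, but
# P(x=1) * P(y=1) = 1/2 * 1/4 = 1/8.
joint = {
    (0, 0): Fraction(1, 2),
    (1, 0): Fraction(1, 4),
    (1, 1): Fraction(1, 4),
}


def expect(g):
    """Expectation of g(x, y) under the joint pmf."""
    return sum(g(x, y) * p for (x, y), p in joint.items())


a, b = 3, -2  # arbitrary constants

lhs = expect(lambda x, y: a * x + b * y)
rhs = a * expect(lambda x, y: x) + b * expect(lambda x, y: y)

print(lhs, rhs)  # 1 1
```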

  7. Variance
     • The variance measures the spread of the distribution:
       $\mathbb{V}[X] = \mathbb{E}[(X - \mathbb{E}[X])^2]$
     • Weighted average of the squared distances from the mean.
     • Larger deviations (+ or −) ⇝ higher variance.
     • If $X$ is the age of a randomly selected registered voter, $\mathbb{V}[X]$ is the usual variance of age, computed over the whole population.
     • Sometimes called the population variance to contrast with the sample variance.
     • Standard deviation: square root of the variance, $SD(X) = \sqrt{\mathbb{V}[X]}$.
     • The standard deviation is useful because it's on the scale of the original variable.
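A minimal sketch of the variance and standard deviation definitions, reusing a fair six-sided die as a hypothetical example (it is not in the slides). The helper names are illustrative.

```python
import math
from fractions import Fraction


def expectation(pmf):
    """Probability-weighted average of the values."""
    return sum(x * p for x, p in pmf.items())


def variance(pmf):
    """Probability-weighted average of squared distances from the mean."""
    mu = expectation(pmf)
    return sum((x - mu) ** 2 * p for x, p in pmf.items())


# Hypothetical example: a fair die.
die = {face: Fraction(1, 6) for face in range(1, 7)}

var = variance(die)
sd = math.sqrt(var)  # back on the scale of the original variable

print(var)           # 35/12
print(round(sd, 3))  # 1.708
```

The variance is in squared units (squared pips here, squared years for age), while the standard deviation is in the same units as the variable itself.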

  8. Properties of variances
     • Some properties of variance are useful for calculation.
       1. If $b$ is a constant, then $\mathbb{V}[b] = 0$.
       2. If $a$ and $b$ are constants, $\mathbb{V}[aX + b] = a^2\,\mathbb{V}[X]$.
       3. In general, $\mathbb{V}[X + Y] \neq \mathbb{V}[X] + \mathbb{V}[Y]$.
          • If $X$ and $Y$ are independent, then $\mathbb{V}[X + Y] = \mathbb{V}[X] + \mathbb{V}[Y]$.
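These properties can be checked by exact enumeration. The sketch below again uses a hypothetical fair die and illustrative helper names. It compares the sum of two independent dice with the fully dependent case of adding a die to itself.

```python
from fractions import Fraction
from itertools import product


def variance(pmf):
    mean = sum(x * p for x, p in pmf.items())
    return sum((x - mean) ** 2 * p for x, p in pmf.items())


def pmf_of(outcomes):
    """Build a pmf from (value, probability) pairs, merging equal values."""
    pmf = {}
    for value, prob in outcomes:
        pmf[value] = pmf.get(value, 0) + prob
    return pmf


# Hypothetical example: a fair die.
die = {face: Fraction(1, 6) for face in range(1, 7)}

# Property 2: rescale by a and shift by b.
a, b = 3, 10
scaled = pmf_of((a * x + b, p) for x, p in die.items())

# Property 3, independent case: two separate dice.
indep_sum = pmf_of((x + y, px * py)
                   for (x, px), (y, py) in product(die.items(), repeat=2))

# Property 3, dependent case: the same die counted twice.
dep_sum = pmf_of((x + x, p) for x, p in die.items())

print(variance(die))        # 35/12
print(variance(scaled))     # 105/4, which is 9 * 35/12
print(variance(indep_sum))  # 35/6, which is 2 * 35/12
print(variance(dep_sum))    # 35/3, which is 4 * 35/12
```

In the dependent case the variance of the sum is twice as large as the sum of the variances, so the additive rule cannot be applied without checking independence.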

  9. Sums and means are random variables
     • If $X_1$ and $X_2$ are r.v.s, then $X_1 + X_2$ is a r.v.
       • It has a mean $\mathbb{E}[X_1 + X_2]$ and a variance $\mathbb{V}[X_1 + X_2]$.
     • The sample mean is a function of sums, and so it is a r.v. too:
       $\bar{X} = \frac{X_1 + X_2}{2}$
     • Example: the average age of two randomly selected respondents.

  10. Distribution of sums/means

              X_1    X_2    X_1 + X_2    X̄
      draw 1   44     32       76        38
      draw 2   27     50       77        38.5
      draw 3   34     48       82        41
      draw 4   68     28       96        48
        ⋮       ⋮      ⋮        ⋮         ⋮

      The X_1 + X_2 column traces out the distribution of the sum; the X̄ column traces out the distribution of the mean.
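A table like this one can be generated by simulation. The sketch below assumes a hypothetical population in which every age from 18 to 90 is equally likely; that population, the seed, and the function names are assumptions made for illustration, so the simulated draws will not match the numbers on the slide.

```python
import random

rng = random.Random(51)  # fixed seed so reruns give the same draws
ages = range(18, 91)     # hypothetical population: ages 18..90, equally likely


def draw_pair():
    """One draw: two respondents' ages, their sum, and their mean."""
    x1 = rng.choice(ages)
    x2 = rng.choice(ages)
    return x1, x2, x1 + x2, (x1 + x2) / 2


draws = [draw_pair() for _ in range(10_000)]

# First few rows, in the same layout as the table.
for i, (x1, x2, total, mean) in enumerate(draws[:4], start=1):
    print(f"draw {i}: X1={x1} X2={x2} sum={total} mean={mean}")

# The columns of sums and means each have their own distribution.
sums = [d[2] for d in draws]
means = [d[3] for d in draws]
print("average of the sums: ", sum(sums) / len(sums))
print("average of the means:", sum(means) / len(means))
```

Plotting histograms of `sums` and `means` would show the two distributions the slide refers to; they have the same shape, with the mean's axis compressed by a factor of two.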

  11. Independent and identical r.v.s
     • Independent and identically distributed r.v.s, $X_1, \ldots, X_n$
       • Random sample of $n$ respondents on a survey question.
       • Written "i.i.d."
     • Independent: the value that $X_i$ takes doesn't affect the distribution of $X_j$.
     • Identically distributed: the distribution of $X_i$ is the same for all $i$.
       • $\mathbb{E}(X_1) = \mathbb{E}(X_2) = \cdots = \mathbb{E}(X_n) = \mu$
       • $\mathbb{V}(X_1) = \mathbb{V}(X_2) = \cdots = \mathbb{V}(X_n) = \sigma^2$

  12. Distribution of the sample mean
     • Sample mean of i.i.d. random variables:
       $\bar{X}_n = \frac{X_1 + X_2 + \cdots + X_n}{n}$
     • $\bar{X}_n$ is a random variable; what is its distribution?
       • What is the expectation of this distribution, $\mathbb{E}[\bar{X}_n]$?
       • What is the variance of this distribution, $\mathbb{V}[\bar{X}_n]$?
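One way to explore these questions before working them out on paper is to simulate many sample means and look at their center and spread. This is a rough sketch under assumptions that are not in the slides: a hypothetical population with ages 18 to 90 equally likely (population mean 54), an arbitrary seed, and illustrative function names.

```python
import random
import statistics

rng = random.Random(51)  # fixed seed for reproducibility
ages = range(18, 91)     # hypothetical population: ages 18..90, equally likely


def sample_mean(n):
    """Mean of n i.i.d. draws from the hypothetical population."""
    return sum(rng.choice(ages) for _ in range(n)) / n


def simulate(n, reps=5_000):
    """Center and spread of the sample mean across many repeated samples."""
    means = [sample_mean(n) for _ in range(reps)]
    return statistics.fmean(means), statistics.pstdev(means)


results = {n: simulate(n) for n in (1, 4, 16, 64)}
for n, (center, spread) in results.items():
    print(f"n={n:>2}  mean of sample means={center:6.2f}  "
          f"SD of sample means={spread:5.2f}")
```

In runs of this sketch the center should stay close to the population mean of 54 at every sample size, while the spread shrinks as `n` grows.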

  13. Properties of the sample mean
     Mean and variance of the sample mean: suppose that $X_1, \ldots, X_n$ are i.i.d. r.v.s with $\mathbb{E}[X_i] = \mu$ and $\mathbb{V}[X_i] = \sigma^2$. Then:
       $\mathbb{E}[\bar{X}_n] = \mu$
       $\mathbb{V}[\bar{X}_n] = \frac{\sigma^2}{n}$
     • Key insights:
       • The sample mean is on average equal to the population mean.
       • The variance of $\bar{X}_n$ depends on the population variance of $X_i$ and the sample size.
       • The standard deviation of the sample mean is called its standard error:
         $SE = \sqrt{\mathbb{V}[\bar{X}_n]} = \frac{\sigma}{\sqrt{n}}$
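Both results follow from the properties of expectation and variance listed on the earlier slides. The derivation below is a sketch that fills in the intermediate steps; the slides state only the results.

```latex
\begin{align*}
\mathbb{E}[\bar{X}_n]
  &= \mathbb{E}\!\left[\frac{1}{n}\sum_{i=1}^{n} X_i\right]
   = \frac{1}{n}\sum_{i=1}^{n} \mathbb{E}[X_i]
   = \frac{1}{n}\, n\mu
   = \mu \\[6pt]
\mathbb{V}[\bar{X}_n]
  &= \mathbb{V}\!\left[\frac{1}{n}\sum_{i=1}^{n} X_i\right]
   = \frac{1}{n^2}\,\mathbb{V}\!\left[\sum_{i=1}^{n} X_i\right]
   = \frac{1}{n^2}\sum_{i=1}^{n} \mathbb{V}[X_i]
   = \frac{1}{n^2}\, n\sigma^2
   = \frac{\sigma^2}{n}
\end{align*}
```

The expectation line needs only linearity, so it holds even without independence. The variance line uses independence in the step that splits the variance of the sum into a sum of variances, and identical distribution to replace each term with the same value.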
