UQ, STAT2201, 2017, Lecture 7. Unit 7 – Single Sample Inference.

Setup: A sample $x_1, \ldots, x_n$ (collected values).
Model: An i.i.d. sequence of random variables, $X_1, \ldots, X_n$.
Parameter in question: The population mean, $\mu = E[X_i]$.
Point estimate: $\bar{x}$ (described by the random variable $\bar{X}$).

Goal: Devise hypothesis tests and confidence intervals for $\mu$. Distinguish between the two cases:
Unrealistic (but simpler): The population variance, $\sigma^2$, is known.
More realistic: The variance is not known and is estimated by the sample variance, $s^2$.

For very small samples, the results we present are valid only if the population is normally distributed. But for non-small samples (e.g. $n > 20$, although there isn't a clear rule), the central limit theorem provides a good approximation and the results are approximately correct.

Testing Hypotheses on the Mean, Variance Known (Z-Tests)

Model: $X_i \overset{\text{i.i.d.}}{\sim} N(\mu, \sigma^2)$ with $\mu$ unknown but $\sigma^2$ known.

Null hypothesis: $H_0: \mu = \mu_0$.

Test statistic: $z = \dfrac{\bar{x} - \mu_0}{\sigma/\sqrt{n}}$, $\quad Z = \dfrac{\bar{X} - \mu_0}{\sigma/\sqrt{n}}$.

| Alternative Hypotheses | P-value | Rejection Criterion for Fixed-Level Tests |
| $H_1: \mu \neq \mu_0$ | $P = 2\big[1 - \Phi(\lvert z \rvert)\big]$ | $z > z_{1-\alpha/2}$ or $z < z_{\alpha/2}$ |
| $H_1: \mu > \mu_0$ | $P = 1 - \Phi(z)$ | $z > z_{1-\alpha}$ |
| $H_1: \mu < \mu_0$ | $P = \Phi(z)$ | $z < z_{\alpha}$ |
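The Z-test table can be sketched in a few lines of Python using only the standard library. The function name `z_test`, the `alternative` labels, and the numbers in the example are illustrative assumptions, not part of the lecture.

```python
from statistics import NormalDist

def z_test(xbar, mu0, sigma, n, alternative="two-sided"):
    """Return (z, p_value) for a Z-test on the mean with known sigma."""
    z = (xbar - mu0) / (sigma / n ** 0.5)
    phi = NormalDist().cdf  # standard normal CDF, Phi
    if alternative == "two-sided":    # H1: mu != mu0
        p = 2 * (1 - phi(abs(z)))
    elif alternative == "greater":    # H1: mu > mu0
        p = 1 - phi(z)
    elif alternative == "less":       # H1: mu < mu0
        p = phi(z)
    else:
        raise ValueError("alternative must be 'two-sided', 'greater' or 'less'")
    return z, p

# Hypothetical numbers: sample mean 52 from n = 16 observations, sigma = 4, mu0 = 50.
z, p = z_test(xbar=52, mu0=50, sigma=4, n=16)
print(f"z = {z:.3f}, two-sided P-value = {p:.4f}")
```

With these numbers $z = 2$, so a fixed-level test at $\alpha = 0.05$ rejects $H_0$ because $z > z_{0.975} \approx 1.96$.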

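For $H_1: \mu \neq \mu_0$, a procedure identical to the preceding fixed significance level test is:

Reject $H_0: \mu = \mu_0$ if either $\bar{x} < a$ or $\bar{x} > b$, where

$$a = \mu_0 - z_{1-\alpha/2}\,\frac{\sigma}{\sqrt{n}} \quad \text{and} \quad b = \mu_0 + z_{1-\alpha/2}\,\frac{\sigma}{\sqrt{n}}.$$

Compare with the confidence interval formula:

$$\bar{x} - z_{1-\alpha/2}\,\frac{\sigma}{\sqrt{n}} \;\le\; \mu \;\le\; \bar{x} + z_{1-\alpha/2}\,\frac{\sigma}{\sqrt{n}}.$$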
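The link between the rejection rule and the confidence interval can be checked numerically: $\bar{x}$ falls outside $[a, b]$ exactly when $\mu_0$ falls outside the confidence interval. The sketch below uses hypothetical helper names and the same illustrative numbers ($\bar{x} = 52$, $\mu_0 = 50$, $\sigma = 4$, $n = 16$).

```python
from statistics import NormalDist

def half_width(sigma, n, alpha=0.05):
    """z_{1-alpha/2} * sigma / sqrt(n), shared by both constructions."""
    return NormalDist().inv_cdf(1 - alpha / 2) * sigma / n ** 0.5

def z_confidence_interval(xbar, sigma, n, alpha=0.05):
    """100(1 - alpha)% confidence interval for mu, centred at xbar."""
    h = half_width(sigma, n, alpha)
    return xbar - h, xbar + h

def non_rejection_region(mu0, sigma, n, alpha=0.05):
    """Interval [a, b] of xbar values for which H0 is not rejected, centred at mu0."""
    h = half_width(sigma, n, alpha)
    return mu0 - h, mu0 + h

# Hypothetical numbers, chosen only for illustration.
xbar, mu0, sigma, n = 52, 50, 4, 16
lo, hi = z_confidence_interval(xbar, sigma, n)
a, b = non_rejection_region(mu0, sigma, n)

reject_by_region = xbar < a or xbar > b
reject_by_interval = not (lo <= mu0 <= hi)
print(f"CI = ({lo:.3f}, {hi:.3f}), [a, b] = ({a:.3f}, {b:.3f})")
print(reject_by_region, reject_by_interval)
```

Both constructions use the same half-width, so the two decisions always agree.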

If $H_0$ is not true and $H_1$ holds with a specific value $\mu = \mu_1$, then it is possible to compute the probability of a type II error, $\beta$, that is, the probability of failing to reject $H_0$ when $\mu = \mu_1$.
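For the two-sided Z-test, one standard way to obtain $\beta$ (not spelled out in the slides) uses the fact that, when $\mu = \mu_1$, the statistic $Z$ is Normal with mean $(\mu_1 - \mu_0)\sqrt{n}/\sigma$ and variance 1, so $\beta$ is the probability that $Z$ lands between $-z_{1-\alpha/2}$ and $z_{1-\alpha/2}$. A minimal sketch, with a hypothetical function name and illustrative numbers:

```python
from statistics import NormalDist

def type_ii_error(mu0, mu1, sigma, n, alpha=0.05):
    """P(fail to reject H0: mu = mu0 | true mean is mu1), two-sided Z-test."""
    std_normal = NormalDist()
    z_crit = std_normal.inv_cdf(1 - alpha / 2)
    # Mean of Z when the true mean is mu1 (Z still has unit variance).
    shift = (mu1 - mu0) * n ** 0.5 / sigma
    return std_normal.cdf(z_crit - shift) - std_normal.cdf(-z_crit - shift)

# Hypothetical setting: mu0 = 50, true mean mu1 = 52, sigma = 4.
for n in (16, 32, 64):
    beta = type_ii_error(mu0=50, mu1=52, sigma=4, n=n)
    print(f"n = {n:3d}: beta = {beta:.4f}, power = {1 - beta:.4f}")
```

Increasing $n$ shrinks $\beta$, which is the usual basis for sample-size calculations.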

In the (very realistic) case where $\sigma^2$ is not known, but rather estimated by $S^2$, we would like to replace the test statistic, $Z$, above with

$$T = \frac{\bar{X} - \mu_0}{S/\sqrt{n}},$$

but in general, $T$ no longer follows a Normal distribution.

Under $H_0: \mu = \mu_0$, and for moderate or large samples (e.g. $n > 100$), this statistic is approximately Normally distributed, just like above. In this case, the procedures above work well.

But for smaller samples, $T$ is no longer Normally distributed. Nevertheless, it follows a well known and very famous distribution of classical statistics: the Student-t distribution. The probability density function of a Student-t distribution with a parameter $k$, referred to as degrees of freedom, is

$$f(x) = \frac{\Gamma\big((k+1)/2\big)}{\sqrt{\pi k}\,\Gamma(k/2)} \cdot \frac{1}{\big((x^2/k) + 1\big)^{(k+1)/2}}, \qquad -\infty < x < \infty,$$

where $\Gamma(\cdot)$ is the Gamma function. It is a symmetric distribution about 0, and as $k \to \infty$ it approaches a standard Normal distribution.
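The density is easy to evaluate directly. The sketch below (function name `t_pdf` is an assumption) works on the log scale with `math.lgamma`, an implementation choice made here so that large $k$ does not overflow the Gamma function, and compares the result with the standard Normal density.

```python
import math
from statistics import NormalDist

def t_pdf(x, k):
    """Density of the Student-t distribution with k degrees of freedom."""
    log_const = (math.lgamma((k + 1) / 2) - math.lgamma(k / 2)
                 - 0.5 * math.log(math.pi * k))
    return math.exp(log_const - (k + 1) / 2 * math.log1p(x * x / k))

normal_pdf = NormalDist().pdf
# The gap to the standard Normal density shrinks as k grows.
for k in (1, 5, 30, 1000):
    print(f"k = {k:4d}: f(1) = {t_pdf(1.0, k):.5f}, normal = {normal_pdf(1.0):.5f}")
```

For $k = 1$ the density at 0 equals $1/\pi$, which is the Cauchy distribution.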

Why is the t-distribution so useful in (small sample) elementary statistics?

Claim: Let $X_1, X_2, \ldots, X_n$ be an i.i.d. sample from a Normal distribution with mean $\mu$ and variance $\sigma^2$. The random variable

$$T = \frac{\bar{X} - \mu}{S/\sqrt{n}}$$

has a t distribution with $n - 1$ degrees of freedom.
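The claim can be illustrated (not proved) by simulation. The sketch below draws many Normal samples of size $n = 5$, computes $T$ for each, and estimates how often $|T| > 1.96$. Under a standard Normal this would happen about 5% of the time; for a t distribution with 4 degrees of freedom the probability is noticeably larger (roughly 12%). The function name, seed, and replication count are arbitrary choices.

```python
import math
import random

def simulate_t_statistics(n, reps, mu=0.0, sigma=1.0, seed=1):
    """Simulate T = (Xbar - mu) / (S / sqrt(n)) for Normal samples of size n."""
    rng = random.Random(seed)
    values = []
    for _ in range(reps):
        sample = [rng.gauss(mu, sigma) for _ in range(n)]
        xbar = sum(sample) / n
        # Sample standard deviation with the n - 1 denominator.
        s = math.sqrt(sum((x - xbar) ** 2 for x in sample) / (n - 1))
        values.append((xbar - mu) / (s / math.sqrt(n)))
    return values

t_values = simulate_t_statistics(n=5, reps=20000)
tail_fraction = sum(abs(t) > 1.96 for t in t_values) / len(t_values)
print(f"Estimated P(|T| > 1.96) for n = 5: {tail_fraction:.3f}")
```

The heavier tails are why using Normal quantiles with a small sample and estimated variance rejects too often.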

Knowing the distribution of $T$ (and noticing that it depends on the sample size, $n$) allows us to construct hypothesis tests and confidence intervals when $\sigma^2$ is not known. The construction is analogous to the Z-tests and confidence intervals.

If $\bar{x}$ and $s$ are the mean and standard deviation of a random sample from a normal distribution with unknown variance $\sigma^2$, a $100(1-\alpha)\%$ confidence interval on $\mu$ is given by

$$\bar{x} - t_{1-\alpha/2,\,n-1}\,\frac{s}{\sqrt{n}} \;\le\; \mu \;\le\; \bar{x} + t_{1-\alpha/2,\,n-1}\,\frac{s}{\sqrt{n}},$$

where $t_{1-\alpha/2,\,n-1}$ is the $1-\alpha/2$ quantile of the t distribution with $n-1$ degrees of freedom.

A related concept is the prediction interval (PI). A $100(1-\alpha)\%$ prediction interval on a single future observation from a normal distribution is given by

$$\bar{x} - t_{1-\alpha/2,\,n-1}\, s\sqrt{1 + \frac{1}{n}} \;\le\; X_{n+1} \;\le\; \bar{x} + t_{1-\alpha/2,\,n-1}\, s\sqrt{1 + \frac{1}{n}}.$$

This is the range where we expect observation $n+1$ to be, after observing $n$ observations and computing $\bar{x}$ and $s$.
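Both the t-based confidence interval for $\mu$ and the prediction interval need the quantile $t_{1-\alpha/2,\,n-1}$. In practice a statistics library would supply it; the sketch below instead stays within the Python standard library by integrating the density with Simpson's rule and inverting the CDF by bisection. All function names and the summary statistics ($\bar{x} = 20$, $s = 2$, $n = 10$) are hypothetical.

```python
import math

def t_pdf(x, k):
    """Student-t density with k degrees of freedom."""
    log_const = (math.lgamma((k + 1) / 2) - math.lgamma(k / 2)
                 - 0.5 * math.log(math.pi * k))
    return math.exp(log_const - (k + 1) / 2 * math.log1p(x * x / k))

def t_cdf(x, k, steps=2000):
    """Student-t CDF: Simpson's rule on [0, |x|] plus symmetry about 0."""
    a = abs(x)
    if a == 0:
        return 0.5
    h = a / steps  # steps must be even for Simpson's rule
    total = t_pdf(0.0, k) + t_pdf(a, k)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, k)
    area = total * h / 3
    return 0.5 + area if x > 0 else 0.5 - area

def t_quantile(p, k):
    """Quantile of the t distribution, found by bisection on the CDF."""
    if not 0 < p < 1:
        raise ValueError("p must lie strictly between 0 and 1")
    if p < 0.5:
        return -t_quantile(1 - p, k)
    lo, hi = 0.0, 1.0
    while t_cdf(hi, k) < p:  # grow the bracket until it contains the quantile
        hi *= 2
    for _ in range(60):
        mid = (lo + hi) / 2
        if t_cdf(mid, k) < p:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2

def t_confidence_interval(xbar, s, n, alpha=0.05):
    """100(1 - alpha)% confidence interval for mu, variance unknown."""
    half = t_quantile(1 - alpha / 2, n - 1) * s / math.sqrt(n)
    return xbar - half, xbar + half

def t_prediction_interval(xbar, s, n, alpha=0.05):
    """100(1 - alpha)% prediction interval for the next observation."""
    half = t_quantile(1 - alpha / 2, n - 1) * s * math.sqrt(1 + 1 / n)
    return xbar - half, xbar + half

ci = t_confidence_interval(xbar=20.0, s=2.0, n=10)
pred = t_prediction_interval(xbar=20.0, s=2.0, n=10)
print("95% CI for mu:", ci)
print("95% PI for next observation:", pred)
```

The prediction interval is always wider than the confidence interval, since it accounts for the variability of the new observation as well as the uncertainty in $\bar{x}$.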

Testing Hypotheses on the Mean, Variance Unknown (T-Tests)

Model: $X_i \overset{\text{i.i.d.}}{\sim} N(\mu, \sigma^2)$ with both $\mu$ and $\sigma^2$ unknown.

Null hypothesis: $H_0: \mu = \mu_0$.

Test statistic: $t = \dfrac{\bar{x} - \mu_0}{s/\sqrt{n}}$, $\quad T = \dfrac{\bar{X} - \mu_0}{S/\sqrt{n}}$.

| Alternative Hypotheses | P-value | Rejection Criterion for Fixed-Level Tests |
| $H_1: \mu \neq \mu_0$ | $P = 2\big[1 - F_{n-1}(\lvert t \rvert)\big]$ | $t > t_{1-\alpha/2,\,n-1}$ or $t < t_{\alpha/2,\,n-1}$ |
| $H_1: \mu > \mu_0$ | $P = 1 - F_{n-1}(t)$ | $t > t_{1-\alpha,\,n-1}$ |
| $H_1: \mu < \mu_0$ | $P = F_{n-1}(t)$ | $t < t_{\alpha,\,n-1}$ |

In the P-value calculation, $F_{n-1}(\cdot)$ denotes the CDF of the t-distribution with $n-1$ degrees of freedom. As opposed to $\Phi(\cdot)$, the CDF of t is not tabulated in standard tables. So to calculate P-values, we use software (or make educated guesses using quantiles).
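As an illustration of the "use software" route, the sketch below computes the t statistic and its P-value from raw data. To stay within the Python standard library, $F_{n-1}$ is approximated by integrating the density numerically; a statistics package would normally provide this CDF directly. The function names and the five-point sample are hypothetical.

```python
import math

def t_pdf(x, k):
    """Student-t density with k degrees of freedom."""
    log_const = (math.lgamma((k + 1) / 2) - math.lgamma(k / 2)
                 - 0.5 * math.log(math.pi * k))
    return math.exp(log_const - (k + 1) / 2 * math.log1p(x * x / k))

def t_cdf(x, k, steps=2000):
    """F_k(x), approximated with Simpson's rule on [0, |x|] and symmetry."""
    a = abs(x)
    if a == 0:
        return 0.5
    h = a / steps  # steps must be even for Simpson's rule
    total = t_pdf(0.0, k) + t_pdf(a, k)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, k)
    area = total * h / 3
    return 0.5 + area if x > 0 else 0.5 - area

def t_test(sample, mu0, alternative="two-sided"):
    """Return (t, p_value) for a one-sample t-test of H0: mu = mu0."""
    n = len(sample)
    xbar = sum(sample) / n
    s = math.sqrt(sum((x - xbar) ** 2 for x in sample) / (n - 1))
    t = (xbar - mu0) / (s / math.sqrt(n))
    if alternative == "two-sided":    # H1: mu != mu0
        p = 2 * (1 - t_cdf(abs(t), n - 1))
    elif alternative == "greater":    # H1: mu > mu0
        p = 1 - t_cdf(t, n - 1)
    elif alternative == "less":       # H1: mu < mu0
        p = t_cdf(t, n - 1)
    else:
        raise ValueError("alternative must be 'two-sided', 'greater' or 'less'")
    return t, p

# Hypothetical data, chosen only to keep the arithmetic easy to follow.
t_stat, p_value = t_test([1, 2, 3, 4, 5], mu0=2)
print(f"t = {t_stat:.4f}, two-sided P-value = {p_value:.4f}")
```

Here $\bar{x} = 3$, $s = \sqrt{2.5}$ and $n = 5$, so $t = \sqrt{2} \approx 1.414$ with 4 degrees of freedom, and the two-sided P-value is about 0.23.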
