STAT2201 Analysis of Engineering & Scientific Data Unit 8 - PowerPoint PPT Presentation

STAT2201 Analysis of Engineering & Scientific Data Unit 8 Slava Vaisman The University of Queensland School of Mathematics and Physics

Two Sample Inference
◮ This time, we consider two different samples: $x_1, \ldots, x_{n_1}$ and $y_1, \ldots, y_{n_2}$.
◮ These samples are modeled as independent random variables $X_1, \ldots, X_{n_1}$ and $Y_1, \ldots, Y_{n_2}$, i.i.d. within each sample.
◮ The sample size $n_1$ is not necessarily equal to $n_2$.
◮ We model $\{X_i\}_{1 \le i \le n_1}$ and $\{Y_i\}_{1 \le i \le n_2}$ with
  $X_i \sim N(\mu_1, \sigma_1^2), \quad Y_i \sim N(\mu_2, \sigma_2^2),$
◮ and distinguish between the following cases:
  equal variances: $\sigma_1^2 = \sigma_2^2 = \sigma^2$,
  unequal variances: $\sigma_1^2 \neq \sigma_2^2$.

Medical treatment
Recall the experimental medical treatment example, in which 14 subjects were randomly assigned to a control or a treatment group. The survival times (in days) are shown in the table below.

  Group            Data                            Mean
  Treatment group  91, 140, 16, 32, 101, 138, 24   77.429
  Control group    3, 115, 8, 45, 102, 12          47.5

We asked:
◮ Did the treatment prolong survival?
◮ Is the observed result significant, or due to chance?
Note that we are dealing with two samples, $x_1, \ldots, x_7$ and $y_1, \ldots, y_6$, so $n_1 = 7$ and $n_2 = 6$.

Inference

  Group            Data                            Mean
  Treatment group  91, 140, 16, 32, 101, 138, 24   77.429
  Control group    3, 115, 8, 45, 102, 12          47.5

◮ We could carry out single sample inference for each population separately, namely for $\mu_1 = E[X_i]$ and $\mu_2 = E[Y_i]$.
◮ However, we are generally more interested in whether the treatment helps (prolongs the survival time).
◮ Specifically, we focus on the difference in means: $\Delta_\mu = \mu_1 - \mu_2 = E[X_i] - E[Y_i]$.

Inference
◮ For $\Delta_\mu = \mu_1 - \mu_2 = E[X_i] - E[Y_i]$, we can carry out inference jointly.
◮ Specifically, it is common to examine:
  1. $\Delta_\mu > 0 \Rightarrow \mu_1 > \mu_2$, or
  2. $\Delta_\mu < 0 \Rightarrow \mu_1 < \mu_2$, or
  3. $\Delta_\mu = 0 \Rightarrow \mu_1 = \mu_2$.
◮ We can also replace the zero with some $\Delta_0$ to get:
  1. $\Delta_\mu > \Delta_0 \Rightarrow \mu_1 - \mu_2 > \Delta_0$, or
  2. $\Delta_\mu < \Delta_0 \Rightarrow \mu_1 - \mu_2 < \Delta_0$, or
  3. $\Delta_\mu = \Delta_0 \Rightarrow \mu_1 - \mu_2 = \Delta_0$.

A point estimator for $\Delta_\mu$
◮ A point estimator for $\Delta_\mu$ is given by $\bar{X} - \bar{Y}$, where $\bar{X}$ and $\bar{Y}$ are the sample means.
◮ The estimate from the data is given by $\bar{x} - \bar{y}$, where
  $\bar{x} = \frac{1}{n_1} \sum_{i=1}^{n_1} x_i$ and $\bar{y} = \frac{1}{n_2} \sum_{i=1}^{n_2} y_i$.
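As a minimal sketch, the point estimate can be computed for the survival-time data from the medical treatment example. The variable names (`treatment`, `control`, `diff_estimate`) are illustrative choices, not part of the slides.

```python
from statistics import mean

# Survival times (days) from the medical treatment example
treatment = [91, 140, 16, 32, 101, 138, 24]  # x_1, ..., x_7
control = [3, 115, 8, 45, 102, 12]           # y_1, ..., y_6

x_bar = mean(treatment)        # sample mean of the treatment group
y_bar = mean(control)          # sample mean of the control group
diff_estimate = x_bar - y_bar  # point estimate of mu_1 - mu_2

print(f"x_bar = {x_bar:.3f}, y_bar = {y_bar:.3f}, difference = {diff_estimate:.3f}")
```

The printed difference should agree with the two group means listed in the data table.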

Estimating the variances
Point estimates for $\sigma_1^2$ and $\sigma_2^2$ are the individual sample variances:
  $s_1^2 = \frac{1}{n_1 - 1} \sum_{i=1}^{n_1} (x_i - \bar{x})^2, \quad s_2^2 = \frac{1}{n_2 - 1} \sum_{i=1}^{n_2} (y_i - \bar{y})^2. \quad (1)$
1. Equal variances: note that both $s_1^2$ and $s_2^2$ estimate $\sigma^2$. The so-called pooled variance estimator can be obtained via:
  $s_p^2 = \frac{(n_1 - 1) s_1^2 + (n_2 - 1) s_2^2}{n_1 + n_2 - 2}.$
2. Unequal variances: just use (1) to obtain point estimates for $\sigma_1^2$ and $\sigma_2^2$.
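The following sketch evaluates the two sample variances and the pooled variance for the survival-time data. It relies on Python's `statistics.variance`, which divides by n - 1; the variable names are illustrative.

```python
from statistics import variance

# Survival times (days) from the medical treatment example
treatment = [91, 140, 16, 32, 101, 138, 24]
control = [3, 115, 8, 45, 102, 12]
n1, n2 = len(treatment), len(control)

# Individual sample variances (denominators n1 - 1 and n2 - 1)
s1_sq = variance(treatment)
s2_sq = variance(control)

# Pooled variance, meaningful only under the equal variance assumption
sp_sq = ((n1 - 1) * s1_sq + (n2 - 1) * s2_sq) / (n1 + n2 - 2)

print(f"s1^2 = {s1_sq:.2f}, s2^2 = {s2_sq:.2f}, sp^2 = {sp_sq:.2f}")
```

The pooled value is a weighted average of the two sample variances, so it always lies between them.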

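The test statistic
Note that:
◮ $E[\bar{X} - \bar{Y}] = E[\bar{X}] - E[\bar{Y}] = \mu_1 - \mu_2 = \Delta_\mu$, which equals $\Delta_0$ under $H_0$.
◮ Since the two samples are independent, the variance is:
  $\mathrm{Var}(\bar{X} - \bar{Y}) = \mathrm{Var}(\bar{X} + (-1)\bar{Y}) = \mathrm{Var}(\bar{X}) + (-1)^2 \mathrm{Var}(\bar{Y}) = \mathrm{Var}(\bar{X}) + \mathrm{Var}(\bar{Y}) = \frac{\sigma_1^2}{n_1} + \frac{\sigma_2^2}{n_2}.$
This leads to the following test statistic $T$ (note the similarity to the one-sample tests we discussed):
  $T = \frac{\bar{X} - \bar{Y} - \Delta_0}{\sqrt{\frac{s_1^2}{n_1} + \frac{s_2^2}{n_2}}}.$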

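The test statistic
We consider the statistic
  $T = \frac{\bar{X} - \bar{Y} - \Delta_0}{\sqrt{\frac{s_1^2}{n_1} + \frac{s_2^2}{n_2}}}$
under the equal and unequal variance settings.
◮ Equal variances (both $s_1^2$ and $s_2^2$ are replaced by $s_p^2$):
  $T = \frac{\bar{X} - \bar{Y} - \Delta_0}{\sqrt{\frac{s_p^2}{n_1} + \frac{s_p^2}{n_2}}} = \frac{\bar{X} - \bar{Y} - \Delta_0}{s_p \sqrt{\frac{1}{n_1} + \frac{1}{n_2}}},$
◮ Unequal variances:
  $T = \frac{\bar{X} - \bar{Y} - \Delta_0}{\sqrt{\frac{s_1^2}{n_1} + \frac{s_2^2}{n_2}}}.$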

Equal variances
In the equal variance case, under $H_0$ it holds (approximately):
  $T = \frac{\bar{X} - \bar{Y} - \Delta_0}{s_p \sqrt{\frac{1}{n_1} + \frac{1}{n_2}}} \sim t(n_1 + n_2 - 2).$
That is, the test statistic $T$ follows a t-distribution with $n_1 + n_2 - 2$ degrees of freedom.
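A short sketch of the equal variance statistic for the survival-time data, assuming $\Delta_0 = 0$ (no difference in means under $H_0$). The variable names are illustrative.

```python
from math import sqrt
from statistics import mean, variance

# Survival times (days) from the medical treatment example
treatment = [91, 140, 16, 32, 101, 138, 24]
control = [3, 115, 8, 45, 102, 12]
n1, n2 = len(treatment), len(control)
delta_0 = 0.0  # assumed hypothesized difference under H_0

# Pooled standard deviation s_p
sp = sqrt(((n1 - 1) * variance(treatment) + (n2 - 1) * variance(control))
          / (n1 + n2 - 2))

# Equal variance test statistic and its degrees of freedom
t_equal = (mean(treatment) - mean(control) - delta_0) / (sp * sqrt(1 / n1 + 1 / n2))
df_equal = n1 + n2 - 2

print(f"T = {t_equal:.4f} with {df_equal} degrees of freedom")
```

With 7 and 6 observations the reference distribution is t(11).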

Unequal variances
In the unequal variance case, under $H_0$ it holds (approximately):
  $T = \frac{\bar{X} - \bar{Y} - \Delta_0}{\sqrt{\frac{s_1^2}{n_1} + \frac{s_2^2}{n_2}}} \sim t(\nu),$
where
  $\nu = \frac{\left( \frac{s_1^2}{n_1} + \frac{s_2^2}{n_2} \right)^2}{\frac{(s_1^2 / n_1)^2}{n_1 - 1} + \frac{(s_2^2 / n_2)^2}{n_2 - 1}}.$
If $\nu$ is not an integer, we may round it down to the nearest integer (if we would like to use the table).
That is, the test statistic $T$ follows a t-distribution with $\nu$ degrees of freedom.
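The unequal variance statistic and the degrees of freedom $\nu$ can be sketched as follows for the survival-time data, again assuming $\Delta_0 = 0$. The variable names are illustrative.

```python
from math import floor, sqrt
from statistics import mean, variance

# Survival times (days) from the medical treatment example
treatment = [91, 140, 16, 32, 101, 138, 24]
control = [3, 115, 8, 45, 102, 12]
n1, n2 = len(treatment), len(control)
delta_0 = 0.0  # assumed hypothesized difference under H_0

v1 = variance(treatment) / n1  # s_1^2 / n_1
v2 = variance(control) / n2    # s_2^2 / n_2

std_err = sqrt(v1 + v2)  # estimated standard error of the difference in means
t_unequal = (mean(treatment) - mean(control) - delta_0) / std_err

# Degrees of freedom for the unequal variance case
nu = (v1 + v2) ** 2 / (v1 ** 2 / (n1 - 1) + v2 ** 2 / (n2 - 1))
nu_table = floor(nu)  # rounded down for use with a t-table

print(f"T = {t_unequal:.4f}, nu = {nu:.4f} (table lookup: {nu_table})")
```

Note that $\nu$ is generally not an integer; software can work with the fractional value directly, while a table lookup needs the rounded-down value.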

Two sample t-test with equal variance

Two sample t-test with unequal variance

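$1 - \alpha$ Confidence Intervals
1. Equal variance case:
  $\mu_1 - \mu_2 \in \left( \bar{x} - \bar{y} \pm t_{1 - \alpha/2,\, n_1 + n_2 - 2} \; s_p \sqrt{\frac{1}{n_1} + \frac{1}{n_2}} \right)$
2. Unequal variance case:
  $\mu_1 - \mu_2 \in \left( \bar{x} - \bar{y} \pm t_{1 - \alpha/2,\, \nu} \sqrt{\frac{s_1^2}{n_1} + \frac{s_2^2}{n_2}} \right)$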
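A rough sketch of both intervals for the survival-time data at $\alpha = 0.05$. Python's standard library has no t-distribution, so the helpers `t_cdf` and `t_quantile` below are hypothetical stand-ins that integrate the t density numerically (Simpson's rule) and invert it by bisection; in practice a statistics package or a t-table would supply the quantile.

```python
from math import gamma, pi, sqrt
from statistics import mean, variance


def t_cdf(x, df, steps=2000):
    """CDF of the t-distribution via Simpson's rule (steps must be even)."""
    c = gamma((df + 1) / 2) / (sqrt(df * pi) * gamma(df / 2))

    def pdf(u):
        return c * (1 + u * u / df) ** (-(df + 1) / 2)

    a = abs(x)
    h = a / steps
    total = pdf(0.0) + pdf(a)
    for k in range(1, steps):
        total += (4 if k % 2 else 2) * pdf(k * h)
    area = total * h / 3  # integral of the density over [0, |x|]
    return 0.5 + area if x >= 0 else 0.5 - area


def t_quantile(p, df):
    """Quantile for 0.5 < p < 1, found by bisection on the CDF."""
    lo, hi = 0.0, 50.0
    for _ in range(60):
        mid = (lo + hi) / 2
        if t_cdf(mid, df) < p:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2


# Survival times (days) from the medical treatment example
treatment = [91, 140, 16, 32, 101, 138, 24]
control = [3, 115, 8, 45, 102, 12]
n1, n2 = len(treatment), len(control)
alpha = 0.05
diff = mean(treatment) - mean(control)
s1_sq, s2_sq = variance(treatment), variance(control)

# 1. Equal variance case: pooled standard deviation, n1 + n2 - 2 degrees of freedom
sp = sqrt(((n1 - 1) * s1_sq + (n2 - 1) * s2_sq) / (n1 + n2 - 2))
half_equal = t_quantile(1 - alpha / 2, n1 + n2 - 2) * sp * sqrt(1 / n1 + 1 / n2)
ci_equal = (diff - half_equal, diff + half_equal)

# 2. Unequal variance case: nu degrees of freedom (not rounded here)
v1, v2 = s1_sq / n1, s2_sq / n2
nu = (v1 + v2) ** 2 / (v1 ** 2 / (n1 - 1) + v2 ** 2 / (n2 - 1))
half_unequal = t_quantile(1 - alpha / 2, nu) * sqrt(v1 + v2)
ci_unequal = (diff - half_unequal, diff + half_unequal)

print("equal variance CI:  ", ci_equal)
print("unequal variance CI:", ci_unequal)
```

Both intervals contain zero for these data, so neither rules out a zero difference in means at the 95% level.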

t-test example

    using HypothesisTests  # assumed: UnequalVarianceTTest as provided by Julia's HypothesisTests.jl

    Treatment = [91, 140, 16, 32, 101, 138, 24]
    Control = [3, 115, 8, 45, 102, 12]
    UnequalVarianceTTest(Treatment, Control)

Output:

    Two sample t-test (unequal variance)
    ------------------------------------
    Population details:
        parameter of interest:   Mean difference
        value under h_0:         0
        point estimate:          29.92857142857143
        95% confidence interval: (-33.0286, 92.8857)

    Test summary:
        outcome with 95% confidence: fail to reject h_0
        two-sided p-value:           0.3175326630084628

    Details:
        number of observations:   [7,6]
        t-statistic:              1.0475473589407192
        degrees of freedom:       10.89399347312799
        empirical standard error: 28.570136875563534
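To see where the reported two-sided p-value comes from, the sketch below integrates the t density numerically, using the t-statistic and degrees of freedom printed in the output above. The helper `t_cdf` is a hypothetical stand-in written for this illustration (Simpson's rule); it is not how the testing software computes the value.

```python
from math import gamma, pi, sqrt


def t_cdf(x, df, steps=2000):
    """CDF of the t-distribution via Simpson's rule (steps must be even)."""
    c = gamma((df + 1) / 2) / (sqrt(df * pi) * gamma(df / 2))

    def pdf(u):
        return c * (1 + u * u / df) ** (-(df + 1) / 2)

    a = abs(x)
    h = a / steps
    total = pdf(0.0) + pdf(a)
    for k in range(1, steps):
        total += (4 if k % 2 else 2) * pdf(k * h)
    area = total * h / 3  # integral of the density over [0, |x|]
    return 0.5 + area if x >= 0 else 0.5 - area


# Values taken from the printed test output
t_stat = 1.0475473589407192
nu = 10.89399347312799
alpha = 0.05

# Two-sided p-value: probability of a statistic at least as extreme as |t_stat|
p_value = 2 * (1 - t_cdf(abs(t_stat), nu))
reject_h0 = p_value < alpha

print(f"two-sided p-value = {p_value:.4f}, reject H_0: {reject_h0}")
```

Since the p-value exceeds 0.05, the sketch reaches the same decision as the output: fail to reject $H_0$.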
