Confidence intervals and power

Applied Statistics and Experimental Design, Chapter 4

Peter Hoff
Statistics, Biostatistics and the CSSS, University of Washington

Contents: Confidence intervals · Power and Sample Size Determination



Confidence intervals via hypothesis tests

In a one-sample t-test, recall that

  • H0 : E[Y] = µ0 is rejected if √n |(ȳ − µ0)/s| ≥ t_{1−α/2};
  • H0 : E[Y] = µ0 is not rejected if

      √n |(ȳ − µ0)/s| ≤ t_{1−α/2}
      |ȳ − µ0| ≤ (s/√n) × t_{1−α/2}
      ȳ − (s/√n) × t_{1−α/2} ≤ µ0 ≤ ȳ + (s/√n) × t_{1−α/2}

If µ0 satisfies this last line, then it is in the acceptance region; otherwise it is in the rejection region. The "plausible" values of µ are therefore those in the interval

      ȳ ± (s/√n) × t_{1−α/2}.

We say this interval is a "100 × (1 − α)% confidence interval" for µ. The interval contains exactly those values of µ0 that are not rejected by the level-α test.
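This equivalence can be checked numerically: scan candidate values of µ0, run the level-α test at each one, and compare the accepted set with the interval ȳ ± (s/√n) × t_{1−α/2}. A minimal sketch in Python (the sample and the critical value t_{0.975,9} ≈ 2.262 for n = 10 are illustrative assumptions, not from the slides):

```python
import statistics

# Illustrative sample of size n = 10 (not from the slides)
y = [4.1, 5.3, 3.8, 6.0, 5.5, 4.9, 5.1, 4.4, 5.8, 4.6]
n = len(y)
ybar = statistics.mean(y)
s = statistics.stdev(y)

# Critical value t_{1-alpha/2, n-1} for alpha = 0.05 and 9 df
# (from t tables; qt(0.975, 9) in R)
t_crit = 2.262

# Confidence interval: ybar +/- (s / sqrt(n)) * t_crit
half_width = s / n ** 0.5 * t_crit
ci = (ybar - half_width, ybar + half_width)

# The level-alpha test of H0: E[Y] = mu0 does not reject
# exactly when |sqrt(n) (ybar - mu0) / s| <= t_crit
def not_rejected(mu0):
    return abs(n ** 0.5 * (ybar - mu0) / s) <= t_crit

# mu0 is inside the CI if and only if the test does not reject it
for k in range(-100, 200):
    mu0 = k / 10.0
    assert (ci[0] <= mu0 <= ci[1]) == not_rejected(mu0)
```

The loop verifies the slide's point: the confidence interval is exactly the set of null values the test fails to reject.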



Main property of a confidence interval

Suppose you are going to

  1. gather data;
  2. compute a 100 × (1 − α)% confidence interval.

Further suppose H0 : E[Y] = µ0 is true. What is the probability that µ0 will be in your to-be-sampled (random) interval? In other words, what is the probability that the random interval will contain the true value?

      Pr(µ0 in interval | E[Y] = µ0) = 1 − Pr(µ0 not in interval | E[Y] = µ0)
                                     = 1 − Pr(reject H0 | E[Y] = µ0)
                                     = 1 − Pr(reject H0 | H0 is true)
                                     = 1 − α

The quantity 1 − α is called the coverage probability. It is

  • the pre-experimental probability that your confidence interval will cover the true value;
  • the large-sample fraction of experiments in which the confidence interval covers the true mean.


Confidence interval for a difference between treatments

In general, we may construct a 95% confidence interval by finding those null hypotheses that would not be rejected at the 0.05 level.

Sampling model:

      Y_{1,A}, …, Y_{nA,A} ~ i.i.d. normal(µA, σ²)
      Y_{1,B}, …, Y_{nB,B} ~ i.i.d. normal(µB, σ²)

Consider evaluating whether δ is a reasonable value for the difference in means:

      H0 : µB − µA = δ    versus    H1 : µB − µA ≠ δ

Under H0, you should be able to show that

      (ȲB − ȲA − δ) / (sp √(1/nA + 1/nB)) ~ t_{nA+nB−2}.

Thus a given difference δ is accepted at level α if

      |ȳB − ȳA − δ| / (sp √(1/nA + 1/nB)) ≤ tc,

that is,

      (ȳB − ȳA) − sp √(1/nA + 1/nB) × tc ≤ δ ≤ (ȳB − ȳA) + sp √(1/nA + 1/nB) × tc,

where tc = t_{1−α/2, nA+nB−2} is the critical value.
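As a sketch of the interval above, the following Python function computes the pooled two-sample confidence interval from raw data. The data and the hardcoded critical value are illustrative assumptions; for a general α and df one would look up t_{1−α/2, nA+nB−2} in a table (qt() in R).

```python
import statistics

def pooled_two_sample_ci(yA, yB, t_crit):
    """CI for muB - muA: (ybarB - ybarA) +/- sp * sqrt(1/nA + 1/nB) * t_crit,
    where t_crit = t_{1-alpha/2, nA+nB-2} and sp is the pooled SD."""
    nA, nB = len(yA), len(yB)
    diff = statistics.mean(yB) - statistics.mean(yA)
    # Pooled variance: df-weighted average of the two sample variances
    sp2 = ((nA - 1) * statistics.variance(yA)
           + (nB - 1) * statistics.variance(yB)) / (nA + nB - 2)
    se = sp2 ** 0.5 * (1 / nA + 1 / nB) ** 0.5
    return (diff - se * t_crit, diff + se * t_crit)

# Illustrative data with nA = nB = 6, so df = 10 and t_{0.975,10} ~ 2.228
yA = [18.2, 21.0, 17.5, 20.3, 19.1, 18.8]
yB = [24.9, 26.1, 23.4, 25.7, 24.0, 26.8]
lo, hi = pooled_two_sample_ci(yA, yB, t_crit=2.228)
```

A value δ lies in this interval exactly when the level-0.05 test of H0 : µB − µA = δ would not reject.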


Wheat example:

  • ȳB − ȳA = 5.93
  • sp = 4.72, sp √(1/nA + 1/nB) = 2.72
  • t_{.975,10} = 2.23

A 95% C.I. for µB − µA is

      5.93 ± 2.72 × 2.23
      5.93 ± 6.06 = (−0.13, 11.99)

Questions:

  • What does the fact that 0 is in the interval say about H0 : µA = µB?
  • What is the interpretation of this interval?
  • Could we have constructed an interval via a randomization test?
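The wheat interval can be reproduced directly from the summary statistics above, using the unrounded critical value t_{0.975,10} ≈ 2.2281 (the slide rounds it to 2.23; computing it at runtime would require something like scipy.stats.t.ppf(0.975, 10)):

```python
# Summary statistics from the wheat example
diff = 5.93      # ybarB - ybarA
se = 2.72        # sp * sqrt(1/nA + 1/nB)
t_crit = 2.2281  # t_{0.975, 10}, rounded to 2.23 on the slide

lo = diff - se * t_crit
hi = diff + se * t_crit
print(round(lo, 2), round(hi, 2))  # -0.13 11.99
```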

Simulation study

To be clear about the notion of coverage probability, let's perform a small simulation study:

      muA<-19 ; muB<-25 ; sig2<-23
      nA<-nB<-6
      CI<-NULL
      for(s in 1:100) {
        yA<-rnorm(nA, muA, sqrt(sig2))
        yB<-rnorm(nB, muB, sqrt(sig2))
        CI<-rbind(CI, t.test(yB, yA, var.equal=TRUE)$conf.int)
      }
      # fraction of the 100 intervals covering the true difference muB - muA = 6
      mean(CI[,1] < 6 & CI[,2] > 6)

In this simulation,

  • the data are from two normal populations with a common variance;
  • the true difference in means is 6.

Power and Sample Size Determination

Study design: gather data on two groups, then decide whether there is a difference. The conclusion will be made based on a level-α two-sample t-test.

Two-sample experiment and t-test:

  • H0 : µA = µB versus H1 : µA ≠ µB.
  • Randomize treatments to the two groups via a CRD.
  • Gather data.
  • Perform a level-α hypothesis test: reject H0 if |tobs| ≥ t_{1−α/2, nA+nB−2}.

Recall that if α = 0.05 and nA, nB are large, then t_{1−α/2, nA+nB−2} ≈ 2.


Type I and Type II error

We know that the type I error rate is α = 0.05, or more precisely:

      Pr(type I error | H0 true) = Pr(reject H0 | H0 true) = 0.05.

What about

      Pr(type II error | H0 false) = Pr(accept H0 | H0 false) = 1 − Pr(reject H0 | H0 false)?

This is not yet a well-defined probability: there are many different ways in which the null hypothesis may be false. For example,

  • µB − µA = 0.0001
  • µB − µA = 10,000

are both instances of the alternative hypothesis. However, all else being equal, we have

      Pr(reject H0 | µB − µA = 0.0001) < Pr(reject H0 | µB − µA = 10,000).


Power under alternatives

To define the Type II error rate, we need to refer to a specific alternative hypothesis. For example, for a specific difference δ we may want to calculate:

      1 − Pr(type II error | µB − µA = δ) = Pr(reject H0 | µB − µA = δ).

The power of a two-sample t-test under a specific alternative is:

      Power(δ) = Pr(reject H0 | µB − µA = δ)
               = Pr(|t(YA, YB)| ≥ t_{1−α/2, nA+nB−2} | µB − µA = δ).

Remember, the "critical" value t_{1−α/2, nA+nB−2} above which we reject the null hypothesis was computed from the null distribution. Now we want the probability of getting a t-statistic larger than this critical value when a specific alternative hypothesis is true. Thus we need to compute the distribution of our t-statistic under the specific alternative hypothesis.
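The dependence of power on δ can also be estimated by Monte Carlo rather than computed analytically. A sketch in Python (the slides' own code is in R; the group sizes, variance, and critical value t_{0.975,10} ≈ 2.228 below are illustrative assumptions matching the earlier two-group setting):

```python
import random
import statistics

def mc_power(delta, n_a=6, n_b=6, sig2=23.0, t_crit=2.228,
             n_sim=2000, seed=1):
    """Estimate Pr(reject H0 | muB - muA = delta) for the level-0.05
    two-sample t-test with nA = nB = 6, i.e. t_crit = t_{0.975, 10}."""
    rng = random.Random(seed)
    sd = sig2 ** 0.5
    rejections = 0
    for _ in range(n_sim):
        y_a = [rng.gauss(0.0, sd) for _ in range(n_a)]
        y_b = [rng.gauss(delta, sd) for _ in range(n_b)]
        # Pooled-variance t-statistic
        sp2 = ((n_a - 1) * statistics.variance(y_a)
               + (n_b - 1) * statistics.variance(y_b)) / (n_a + n_b - 2)
        se = sp2 ** 0.5 * (1 / n_a + 1 / n_b) ** 0.5
        t_obs = (statistics.mean(y_b) - statistics.mean(y_a)) / se
        if abs(t_obs) >= t_crit:
            rejections += 1
    return rejections / n_sim

# Power grows with the true difference delta; near delta = 0 it
# approaches the type I error rate alpha = 0.05.
print(mc_power(0.0001), mc_power(3), mc_power(10000))
```

This reproduces the inequality from the previous section: the rejection probability at δ = 0.0001 is near α, while at δ = 10,000 it is essentially 1.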


Y1,A, …, YnA,A ∼ i.i.d. normal(µA, σ²)
Y1,B, …, YnB,B ∼ i.i.d. normal(µB, σ²)

Suppose µB − µA = δ. To calculate the power we need the distribution of

t(YA, YB) = (ȲB − ȲA) / (sp √(1/nA + 1/nB)).

We know that if µB − µA = δ then

(ȲB − ȲA − δ) / (sp √(1/nA + 1/nB)) ∼ tnA+nB−2,

but unfortunately this is not our test statistic. Instead,

t(YA, YB) = (ȲB − ȲA − δ) / (sp √(1/nA + 1/nB)) + δ / (sp √(1/nA + 1/nB)).   (1)


t(YA, YB) = (ȲB − ȲA − δ) / (sp √(1/nA + 1/nB)) + δ / (sp √(1/nA + 1/nB)).

  • The first term in the above equation has a t-distribution, which is centered around zero.
  • The second term moves the t-statistic away from zero by an amount that depends on the pooled sample variance.

For this reason, we call the distribution of the t-statistic under µB − µA = δ the non-central t-distribution. In this case, we write

t(YA, YB) ∼ t∗nA+nB−2(γ), with non-centrality parameter γ = δ / (σ √(1/nA + 1/nB)).

Note that this distribution is more complicated than just a t-distribution plus a constant "shift" away from zero: for the t-statistic, the amount of the shift depends on the (random) pooled sample variance.


The non-central t-distribution

A non-central t-distributed random variable can be represented as

T = (Z + γ) / √(X/ν),

where

  • γ is a constant;
  • Z is standard normal;
  • X is χ² with ν degrees of freedom, independent of Z.

The quantity γ is called the non-centrality parameter.

Exercise: Using the above representation, show that the distribution of the t-statistic is a non-central t-distribution, assuming the data are normal and the variance is the same in both groups.
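This representation also makes the distribution easy to simulate. The Python sketch below (ν = 10 and γ = 2 are arbitrary illustrative choices) draws from it and compares the sample mean with the closed-form mean E[T] = γ √(ν/2) Γ((ν−1)/2)/Γ(ν/2) derived on a later slide:

```python
import math
import random

random.seed(1)

def noncentral_t_draw(nu, gamma):
    """T = (Z + gamma) / sqrt(X / nu) with Z ~ N(0,1), X ~ chi^2_nu independent."""
    z = random.gauss(0.0, 1.0)
    x = sum(random.gauss(0.0, 1.0) ** 2 for _ in range(nu))
    return (z + gamma) / math.sqrt(x / nu)

nu, gamma = 10, 2.0
draws = [noncentral_t_draw(nu, gamma) for _ in range(20000)]
mc_mean = sum(draws) / len(draws)

# Closed-form mean of the non-central t-distribution
exact = gamma * math.sqrt(nu / 2) * math.gamma((nu - 1) / 2) / math.gamma(nu / 2)
```

The Monte Carlo mean lands close to the exact value, which is noticeably larger than γ for small ν.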


[Figure: densities of a t10 distribution and two non-central t10-distributions (γ = 0, γ = 1, γ = 2).]


For a non-central t-distribution,

  • the mean is not zero;
  • the distribution is not symmetric.

It can be shown that

E[t(YA, YB) | µB − µA = δ] = δ / (σ √(1/nA + 1/nB)) × √(ν/2) Γ((ν−1)/2) / Γ(ν/2),

where ν = nA + nB − 2 is the degrees of freedom and Γ(x) is the gamma function, a generalization of the factorial:

  • Γ(n + 1) = n! if n is a non-negative integer;
  • Γ(r + 1) = rΓ(r);
  • Γ(1) = 1, Γ(1/2) = √π.
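A quick Python check of the multiplicative factor √(ν/2) Γ((ν−1)/2)/Γ(ν/2) (the ν values below are arbitrary) shows that the mean exceeds the non-centrality parameter, but only slightly once ν is large:

```python
import math

def mean_inflation(nu):
    """sqrt(nu/2) * Gamma((nu-1)/2) / Gamma(nu/2): the factor by which the
    mean of the non-central t exceeds the non-centrality parameter gamma."""
    return math.sqrt(nu / 2) * math.gamma((nu - 1) / 2) / math.gamma(nu / 2)

# The factor shrinks toward 1 as the degrees of freedom grow.
factors = {nu: mean_inflation(nu) for nu in (5, 10, 30, 100)}
```

For ν = 10 the factor is roughly 1.08; by ν = 100 it is within 1% of 1.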


You can show that for large ν,

√(ν/2) Γ((ν−1)/2) / Γ(ν/2) ≈ 1,

so

E[t(YA, YB) | µB − µA = δ] ≈ δ / (σ √(1/nA + 1/nB)).

This isn't really such a big surprise, because we know that

ȲB − ȲA ∼ normal(δ, σ²[1/nA + 1/nB]),

and hence

(ȲB − ȲA) / (σ √(1/nA + 1/nB)) ∼ normal( δ / (σ √(1/nA + 1/nB)), 1 ).

We also know that for large values of nA, nB we have sp ≈ σ, so the non-central t-distribution will (for large enough nA, nB) look approximately normal with

  • mean δ / (σ √(1/nA + 1/nB));
  • standard deviation 1.

Another way to get the same result is to refer back to the expression for the t-statistic given in (1):

t(YA, YB) = (ȲB − ȲA − δ) / (sp √(1/nA + 1/nB)) + δ / (sp √(1/nA + 1/nB)) = anA,nB + bnA,nB.

The term anA,nB has a t-distribution, and becomes standard normal as nA, nB → ∞. As for bnA,nB, since sp² → σ² as nA or nB → ∞, we have

bnA,nB / [ δ / (σ √(1/nA + 1/nB)) ] → 1 as nA, nB → ∞,

i.e., bnA,nB ≈ γ, the non-centrality parameter.
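To see this concentration numerically, note that the ratio bnA,nB/γ equals σ/sp. The Python sketch below (equal group sizes and σ = 1 are illustrative assumptions) simulates the pooled standard deviation and watches the ratio settle near 1 as the groups grow:

```python
import math
import random

random.seed(7)

def sigma_over_sp(n, sims=2000):
    """Average of sigma/sp (i.e. b/gamma) over simulated two-sample
    normal data with sigma = 1 and n observations per group."""
    total = 0.0
    for _ in range(sims):
        ya = [random.gauss(0.0, 1.0) for _ in range(n)]
        yb = [random.gauss(0.0, 1.0) for _ in range(n)]
        ma, mb = sum(ya) / n, sum(yb) / n
        ss = sum((y - ma) ** 2 for y in ya) + sum((y - mb) ** 2 for y in yb)
        sp = math.sqrt(ss / (2 * n - 2))
        total += 1.0 / sp
    return total / sims

r_small = sigma_over_sp(5)    # small samples: ratio biased noticeably above 1
r_large = sigma_over_sp(200)  # large samples: ratio essentially 1
```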


Computing the Power of a test

Recall our level-α testing procedure using the t-test:

  • 1. Sample data, compute tobs = t(YA, YB).
  • 2. Compute the p-value, Pr(|TnA+nB−2| ≥ |tobs|).
  • 3. Reject H0 if the p-value ≤ α, i.e. if |tobs| ≥ t1−α/2,nA+nB−2.

For this procedure, we have shown that

Pr(reject H0 | µB − µA = 0) = Pr(p-value ≤ α | µB − µA = 0)
                            = Pr(|t(YA, YB)| ≥ t1−α/2,nA+nB−2 | H0)
                            = Pr(|TnA+nB−2| ≥ t1−α/2,nA+nB−2)
                            = α.

But what is the probability of rejection under Hδ : µB − µA = δ? Hopefully it is bigger than α!


Power

Let tc = t1−α/2,nA+nB−2, the 1 − α/2 quantile of a t-distribution with nA + nB − 2 degrees of freedom. Then

Pr(reject H0 | µB − µA = δ) = Pr(|t(YA, YB)| > tc | µB − µA = δ)
                            = Pr(|T∗| > tc)
                            = Pr(T∗ > tc) + Pr(T∗ < −tc)
                            = [1 − Pr(T∗ < tc)] + Pr(T∗ < −tc),

where T∗ has the non-central t-distribution with ν = nA + nB − 2 degrees of freedom and non-centrality parameter γ = δ / (σ √(1/nA + 1/nB)).

We will want to make this calculation in order to see whether our sample size is sufficient to have a reasonable chance of rejecting the null hypothesis. If we have a rough idea of δ and σ² we can evaluate the power using this formula, in R:

t.crit  <- qt(1 - alpha/2, nA + nB - 2)
t.gamma <- delta / (sigma * sqrt(1/nA + 1/nB))
t.power <- 1 - pt(t.crit, nA + nB - 2, ncp = t.gamma) +
               pt(-t.crit, nA + nB - 2, ncp = t.gamma)
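The same power calculation can be sanity-checked without any non-central-t CDF at all, by simulating directly from the representation T∗ = (Z + γ)/√(X/ν). The Python sketch below assumes, for illustration only, nA = nB = 15, δ = 1, σ = 1, and the critical value 2.048 ≈ t0.975,28 taken from a t-table:

```python
import math
import random

random.seed(3)

def noncentral_t_draw(nu, gamma):
    """T* = (Z + gamma) / sqrt(X / nu) with Z ~ N(0,1), X ~ chi^2_nu independent."""
    z = random.gauss(0.0, 1.0)
    x = sum(random.gauss(0.0, 1.0) ** 2 for _ in range(nu))
    return (z + gamma) / math.sqrt(x / nu)

nA = nB = 15                       # assumed group sizes
delta, sigma = 1.0, 1.0            # assumed effect size and common SD
t_crit = 2.048                     # approx qt(0.975, 28), from a t-table
gamma = delta / (sigma * math.sqrt(1 / nA + 1 / nB))

sims = 20000
power = sum(abs(noncentral_t_draw(nA + nB - 2, gamma)) > t_crit
            for _ in range(sims)) / sims
```

The Monte Carlo estimate should match the value the R formula gives for the same inputs, up to simulation noise.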


Critical regions and the non-central t-distribution

[Figure: densities of a central t-distribution (γ = 0) and a non-central t-distribution (γ = 1).]

When you do these calculations you should think of this figure. Letting T ∗ and T be non-central and central t-distributed random variables respectively, make sure you can relate the following probabilities to the figure:

  • Pr(T ∗ > tc)
  • Pr(T ∗ < −tc)
  • Pr(T > tc)
  • Pr(T < −tc)

Note that if the power Pr(|T ∗| > tc) is large, then one of Pr(T ∗ > tc) or Pr(T ∗ < −tc) will be very close to zero.


Approximating the power

Recall that for large nA, nB,

t(YA, YB) = (ȲB − ȲA − δ) / (sp √(1/nA + 1/nB)) + δ / (sp √(1/nA + 1/nB)),

which is approximately distributed as normal(γ, 1). The normal approximation to the power is then given by

Pr(|X| > tc) = [1 − Pr(X < tc)] + Pr(X < −tc), where X ∼ normal(γ, 1).

This can be computed in R as

t.norm.power <- 1 - pnorm(t.crit, mean=t.gamma) + pnorm(-t.crit, mean=t.gamma)

This will be a reasonable approximation for large nA, nB. It may be an over-estimate or an under-estimate of the power obtained from the t-distribution.

Finally, keep in mind that in our calculations we have assumed that the variances of the two populations are equal.
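To see how close the approximation is, the sketch below (Python with scipy, standing in for the deck's R; the sample sizes and γ are illustrative) compares the exact power from the non-central t-distribution with the normal approximation:

```python
from scipy.stats import nct, norm, t

nA = nB = 10
alpha, gamma = 0.05, 1.5        # illustrative noncentrality
df = nA + nB - 2
tc = t.ppf(1 - alpha / 2, df)   # critical value t_{1-alpha/2}

# Exact power: Pr(|T*| > tc) for T* non-central t with ncp = gamma
power_t = nct.sf(tc, df, gamma) + nct.cdf(-tc, df, gamma)

# Normal approximation: X ~ normal(gamma, 1)
power_norm = (1 - norm.cdf(tc, loc=gamma)) + norm.cdf(-tc, loc=gamma)
```

For moderate sample sizes the two values are close but not identical, which is the over/under-estimation mentioned above.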


slide-106
SLIDE 106

Confidence intervals Power and Sample Size Determination

Example (selecting a sample size):

Suppose the wheat researchers redo the experiment using a larger sample size. How big should their sample size be? They want to have a good chance of rejecting the null hypothesis µB − µA = 0 at level α = 0.05 if the true difference in means is µB − µA = 5 or more.

  • Effect size: µB − µA = 5.
  • σ² is unknown: we'll assume the pooled sample variance from the first experiment is a good approximation, σ² = 22.24.

Under these conditions, if nA = nB = n, then

γ = (µB − µA) / (σ √(1/nA + 1/nB)) = 5 / (4.72 √(2/n)) = 0.75 √n.

What is the probability we'll reject H0 at level α = 0.05 for a given sample size?
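This question can be answered by evaluating the exact power at γ = 0.75√n over a grid of n; a sketch in Python with scipy (the deck itself does this in R), using the constants from the example above:

```python
from scipy.stats import nct, t

alpha = 0.05
power = {}
for n in range(6, 31):
    df = 2 * n - 2
    tc = t.ppf(1 - alpha / 2, df)   # two-sided critical value
    gamma = 0.75 * n ** 0.5         # gamma = 5 / (4.72 * sqrt(2/n)) ≈ 0.75 * sqrt(n)
    power[n] = nct.sf(tc, df, gamma) + nct.cdf(-tc, df, gamma)
```

Scanning the dictionary for the smallest n with power at least 0.80 reproduces the sample-size recommendation discussed on the following slides.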


slide-112
SLIDE 112

Confidence intervals Power and Sample Size Determination

Computing power

delta <- 5; s2 <- ((nA-1)*var(yA) + (nB-1)*var(yB))/(nA-1+nB-1)
alpha <- 0.05; n <- seq(6, 30)
t.crit <- qt(1-alpha/2, 2*n-2)
t.gamma <- delta/sqrt(s2*(1/n+1/n))
t.power <- 1 - pt(t.crit, 2*n-2, ncp=t.gamma) + pt(-t.crit, 2*n-2, ncp=t.gamma)
t.normal.power <- 1 - pnorm(t.crit, mean=t.gamma) + pnorm(-t.crit, mean=t.gamma)

[Figure: left panel, noncentrality γ versus n; right panel, exact t power and its normal approximation versus n.]

slide-113
SLIDE 113

Confidence intervals Power and Sample Size Determination

Selecting a sample size

If the true mean difference were µB − µA = 5, then the original study only had about a 40% chance of rejecting H0. To have an 80% chance or greater, the researchers would need a sample size of 15 for each group. Note that the true power depends on the unknown true mean difference and true variance (assuming these are equal in the two groups). Even though our power calculations were done under potentially inaccurate values of µB − µA and σ2, they still give us a sense of the power under various parameter values:

  • How is the power affected if the mean difference is bigger? smaller?
  • How is the power affected if the variance is bigger? smaller?
slide-119
SLIDE 119

Confidence intervals Power and Sample Size Determination

Example (power as a function of the effect):

Suppose a chemical company wants to know if a new procedure B will yield more product than the current procedure A. Experiments comparing A to B are expensive, and the company is budgeted to run an experiment with at most 10 observations in each group. Is running the experiment worthwhile? Power under nA = nB = 10 for a variety of values of µB − µA and σ:

[Figure: power versus µB − µA for σ = 1, 2, 3.]

|µB − µA| = 1, σ = 1: power ≈ 0.60. |µB − µA| = 1, σ = 3: power ≈ 0.23.
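The figure's qualitative message, that at a fixed effect the power falls as σ grows, can be checked numerically. A Python/scipy sketch (the deck's figure was produced in R; the `power` helper is mine):

```python
from scipy.stats import nct, t

nA = nB = 10
alpha, delta = 0.05, 1.0        # |muB - muA| = 1, as in the slide
df = nA + nB - 2
tc = t.ppf(1 - alpha / 2, df)

def power(sigma):
    # noncentrality gamma = delta / (sigma * sqrt(1/nA + 1/nB))
    gamma = delta / (sigma * (1 / nA + 1 / nB) ** 0.5)
    return nct.sf(tc, df, gamma) + nct.cdf(-tc, df, gamma)

p1, p2, p3 = power(1), power(2), power(3)   # power for sigma = 1, 2, 3
```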


slide-125
SLIDE 125

Confidence intervals Power and Sample Size Determination

Power as a function of the effect

Power varies with the ratio of the effect size to the standard deviation. The scaled effect size θ = (µB − µA)/σ is the size of the treatment effect scaled by the experimental variability. The noncentrality parameter is then

γ = θ / √(1/nA + 1/nB).

With nA = nB = 10, we have γ = 2.24 × θ.

[Figure: power versus the scaled effect (µB − µA)/σ for nA = nB = 10.]

With only n = 10 in each group, the effect must be at least 1.33 times as big as the standard deviation in order to have an 80% chance of rejecting H0.
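The "1.33 standard deviations" figure can be recovered by searching for the smallest θ that gives 80% power. A bisection sketch in Python/scipy (n = 10 per group, α = 0.05; the search loop is mine, not from the deck):

```python
from scipy.stats import nct, t

n, alpha = 10, 0.05
df = 2 * n - 2
tc = t.ppf(1 - alpha / 2, df)

def power(theta):
    gamma = theta / (2.0 / n) ** 0.5    # gamma = theta / sqrt(1/n + 1/n) ≈ 2.24 * theta
    return nct.sf(tc, df, gamma) + nct.cdf(-tc, df, gamma)

# Bisection for the scaled effect theta solving power(theta) = 0.80
lo, hi = 0.0, 3.0
for _ in range(60):
    mid = (lo + hi) / 2
    lo, hi = (mid, hi) if power(mid) < 0.80 else (lo, mid)
theta80 = (lo + hi) / 2
```

Since power is strictly increasing in θ, the bisection converges to the unique solution, near the 1.33 quoted above.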


slide-129
SLIDE 129

Confidence intervals Power and Sample Size Determination

Increasing power

As we've seen from the normal approximation to the power, for a fixed type I error rate the power is a function of the noncentrality parameter

γ = (µB − µA) / (σ √(1/nA + 1/nB)),

so clearly power is

  • increasing in |µB − µA|;
  • increasing in nA and nB;
  • decreasing in σ².

The first of these we do not control with our experiment (it is the unknown quantity we are trying to learn about). The second, the sample size, we clearly do control. The last, the variance, seems like something that might be beyond our control. However, the experimental variance can often be reduced by dividing up the experimental material into more homogeneous subgroups of experimental units. This design technique, known as blocking, will be discussed in an upcoming chapter.
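The three monotonicity claims above are easy to verify numerically; a Python/scipy sketch with illustrative baseline values (δ = 2, σ = 2, n = 10 are my choices, not from the slides):

```python
from scipy.stats import nct, t

def power(delta, sigma, n, alpha=0.05):
    # Two-sided two-sample t-test power, nA = nB = n, equal variances assumed
    df = 2 * n - 2
    tc = t.ppf(1 - alpha / 2, df)
    gamma = delta / (sigma * (2.0 / n) ** 0.5)
    return nct.sf(tc, df, gamma) + nct.cdf(-tc, df, gamma)

base = power(delta=2, sigma=2, n=10)
```

Perturbing one argument at a time shows power rising with the effect size and the sample size, and falling with the variance.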
