Introductory Statistics Refresher: Short Course, Dr. Julia L. Sharp



SLIDE 1

Introductory Statistics Refresher

  • Dr. Julia L. Sharp

Short Course on Introductory Statistics Part III

Sharp (Clemson University) ASA 1 / 26

SLIDE 2

Hypothesis Testing

As an example, suppose that I claim that I am an excellent free throw shooter, making 80% or more of my free throw shots.

  • Given a claim.
  • Gathered evidence.
  • Assessed the evidence using the claim.
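The logic of "assessing the evidence using the claim" can be sketched numerically. The block below is a minimal illustration, not part of the original slides: the shot counts (12 makes in 20 attempts) are hypothetical, and the helper name `binom_lower_tail` is an assumption introduced here. It computes how likely such a poor showing would be if the 80% claim were true.

```python
from math import comb

def binom_lower_tail(k: int, n: int, p: float) -> float:
    """P(X <= k) for X ~ Binomial(n, p), summed exactly."""
    return sum(comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(k + 1))

# Hypothetical evidence (not from the slides): 12 makes in 20 attempts.
n_shots, n_made = 20, 12
claimed_rate = 0.80  # the claim: 80% of free throws are made

# Chance of 12 or fewer makes if the true rate really were 80%.
p_value = binom_lower_tail(n_made, n_shots, claimed_rate)
print(f"P(X <= {n_made} | n = {n_shots}, p = {claimed_rate}) = {p_value:.4f}")
```

A small probability here means the observed evidence would be unusual if the claim were true, which is the reasoning the remaining slides formalize.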

SLIDE 3

Hypothesis Testing

  • State the null and alternative hypotheses.
  • State the Type I and Type II Errors for the hypotheses.
  • State the level of significance (maximum acceptable α).
  • Check assumptions.
  • Compute the test statistic.
  • Calculate the p-value.
  • Compare the p-value with the level of significance.
  • Make a decision regarding the null hypothesis.
  • Draw a conclusion in terms of the problem.

SLIDE 4

Hypothesis Testing Definitions

  • Null Hypothesis (Ho): a statement of no effect or no change. This statement is assumed to be true unless sufficient evidence is gathered to reject it.
  • Alternative Hypothesis (Ha): the research hypothesis. This is the statement that one wishes to support as being true, which is done by gathering evidence against the null hypothesis.
  • Type I Error: an error that occurs if the null hypothesis is rejected when it is true. The probability of a Type I error is denoted as α.
  • Type II Error: an error that occurs if the null hypothesis is not rejected when it is false. The probability of a Type II error is denoted as β.

SLIDE 5

Hypothesis Testing Definitions

                       State of Nature
  Decision             Ho is True          Ho is False
  Reject Ho            Type I Error        Correct decision
  Fail to Reject Ho    Correct decision    Type II Error

SLIDE 6

More Hypothesis Testing Definitions

  • Test statistic: a quantity computed from sample data that depends on the value of the parameter being tested.
  • Level of significance: the maximum chance of making a Type I error that the researcher is willing to accept.
  • P-value: the probability, computed assuming the null hypothesis is true, that a test statistic will be as extreme as or more extreme than the test statistic that was actually observed.

SLIDE 7

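Small Sample P-value Method

Ho: $\mu = \mu_0$

Test statistic: $t_{\text{obs}} = \dfrac{\bar{y} - \mu_0}{s/\sqrt{n}}$, where $T$ has a $t$ distribution with $n - 1$ degrees of freedom when Ho is true.

  • Ha: $\mu < \mu_0$. P-value: $P(T < t_{\text{obs}})$
  • Ha: $\mu > \mu_0$. P-value: $P(T > t_{\text{obs}})$
  • Ha: $\mu \neq \mu_0$. P-value: $2P(T > |t_{\text{obs}}|)$

Decision Rule: Reject Ho if the p-value is less than or equal to α.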
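The three p-value formulas can be sketched in code. This is a minimal standard-library illustration, not the course's own software: the function names (`t_pdf`, `t_cdf`, `one_sample_p_value`) are assumptions introduced here, and the t distribution's CDF is approximated by integrating its density with Simpson's rule rather than by a statistics library. The values t_obs = -2.0 and df = 9 are illustrative only.

```python
from math import gamma, pi, sqrt

def t_pdf(x: float, df: float) -> float:
    """Density of Student's t distribution with df degrees of freedom."""
    norm = gamma((df + 1) / 2) / (sqrt(df * pi) * gamma(df / 2))
    return norm * (1 + x * x / df) ** (-(df + 1) / 2)

def t_cdf(t: float, df: float, steps: int = 2000) -> float:
    """P(T <= t), integrating the density from 0 to |t| with Simpson's rule."""
    a = abs(t)
    h = a / steps  # steps must be even for Simpson's rule
    total = t_pdf(0.0, df) + t_pdf(a, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    area = total * h / 3  # approximates P(0 < T < |t|)
    return 0.5 + area if t >= 0 else 0.5 - area

def one_sample_p_value(t_obs: float, df: float, alternative: str) -> float:
    """P-value for Ho: mu = mu0 against 'less', 'greater' or 'two-sided'."""
    if alternative == "less":
        return t_cdf(t_obs, df)
    if alternative == "greater":
        return 1 - t_cdf(t_obs, df)
    if alternative == "two-sided":
        return 2 * (1 - t_cdf(abs(t_obs), df))
    raise ValueError(f"unknown alternative: {alternative}")

# Illustrative values only: t_obs = -2.0 with n = 10, so df = 9.
for alt in ("less", "greater", "two-sided"):
    print(alt, round(one_sample_p_value(-2.0, 9, alt), 4))
```

The same observed statistic gives three different p-values, so the alternative hypothesis has to be fixed before looking at the data.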

SLIDE 8

P-value Method Example

Suppose that we would like to conduct a test to determine if the average Phosphorus leaching is less than 50 mm. Recall that the sample mean from 32 lysimeter samples is 44.7166 and the sample standard deviation is 7.8069. Use a significance level of 0.05.

  • State the hypotheses.
  • Compute the test statistic.
  • Determine the p-value.
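The first two tasks can be worked directly from the summary statistics given on this slide. The following is a sketch of the arithmetic, assuming $\mu$ denotes the population mean being tested:

```latex
\begin{align*}
H_0 &: \mu = 50 \qquad H_a : \mu < 50 \\
t_{\text{obs}} &= \frac{\bar{y} - \mu_0}{s/\sqrt{n}}
  = \frac{44.7166 - 50}{7.8069/\sqrt{32}}
  \approx \frac{-5.2834}{1.3801}
  \approx -3.828,
  \qquad \text{df} = n - 1 = 31
\end{align*}
```

The p-value is then the lower-tail probability $P(T < -3.828)$ for a $t$ distribution with 31 degrees of freedom.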

SLIDE 9

P-value Method Example

Suppose that we would like to conduct a test to determine if the average Phosphorus leaching is less than 50 mm. Use a significance level of 0.05.

  • Make a decision regarding Ho.
  • State the conclusion in terms of the problem.

SLIDE 10

Example

Riddle and Bergström (2013) describe several experiments to examine Phosphorus leaching from two soils. A table of results from one of the experiments is reproduced below. There were four different rain simulations used and two soil types (clay and sand). The amount of drainage water collected from lysimeters was recorded.

Riddle, M. U. and Bergström, L. (2013). “Phosphorus leaching from two soils with catch crops

SLIDE 11

Hypothesis Test: Phosphorus Leaching

Conduct a test to determine if the average Phosphorus leaching is less than 50 mm.

    One Sample t-test

    data:  drain$drainage
    t = -3.8283, df = 31, p-value = 0.0002936
    alternative hypothesis: true mean is less than 50
    95 percent confidence interval:
         -Inf 47.05657
    sample estimates:
    mean of x
     44.71664
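The upper confidence bound in this output can be checked by hand. The sketch below assumes the tabled critical value $t_{0.05,\,31} \approx 1.6955$ and uses the summary statistics quoted earlier in the deck ($s = 7.8069$, $n = 32$):

```latex
\begin{align*}
\bar{y} + t_{0.05,\,31}\,\frac{s}{\sqrt{n}}
  &\approx 44.7166 + 1.6955 \times \frac{7.8069}{\sqrt{32}} \\
  &\approx 44.7166 + 1.6955 \times 1.3801 \\
  &\approx 47.06
\end{align*}
```

Because the p-value of 0.0002936 is below the significance level of 0.05, Ho is rejected. Equivalently, the hypothesized value of 50 lies above the upper bound of the one-sided interval.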

SLIDE 12

Inferences Comparing Two Population Central Values

Compare the average responses in two groups.

Assumptions:

  • Independent random samples of n1 observations from one population and n2 observations from a second population are selected.
  • Samples are selected from normal distributions or large sample sizes are used.

GOAL: Make inference about the difference between the population means.

                Population                         Sample
  Population    Mean       Standard Deviation      Size     Mean           Standard Deviation
  1             $\mu_1$    $\sigma_1$              $n_1$    $\bar{y}_1$    $s_1$
  2             $\mu_2$    $\sigma_2$              $n_2$    $\bar{y}_2$    $s_2$

SLIDE 13

Inference for Two Population Means: Example

Riddle and Bergström (2013) describe several experiments to examine Phosphorus leaching from two soils. A table of results from one of the experiments is reproduced below. There were four different rain simulations used and two soil types (clay and sand). The amount of drainage water collected from lysimeters was recorded. Suppose that we would like to compare the average amount of drainage water collected from clay soil to the average amount of drainage water collected from sandy soil.

SLIDE 14

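Sampling Distribution of $\bar{Y}_1 - \bar{Y}_2$

Suppose two independent random variables $Y_1$ and $Y_2$ are normally distributed with appropriate means and variances: $Y_1 \sim N(\mu_1, \sigma_1^2)$ and $Y_2 \sim N(\mu_2, \sigma_2^2)$.

  • The sampling distributions of $\bar{Y}_1$ and $\bar{Y}_2$ are: $\bar{Y}_1 \sim N(\mu_1, \sigma_1^2/n_1)$ and $\bar{Y}_2 \sim N(\mu_2, \sigma_2^2/n_2)$.
  • The sampling distribution of $\bar{Y}_1 - \bar{Y}_2$ is: normal.
  • The mean of the sampling distribution is: $\mu_1 - \mu_2$.
  • The standard error of the sampling distribution is: $\sqrt{\sigma_1^2/n_1 + \sigma_2^2/n_2}$.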

SLIDE 15

Inference for Comparing Two Population Means: Independent Samples

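$\sigma_1^2$ and $\sigma_2^2$

  • Equal ($\sigma_1^2 = \sigma_2^2 = \sigma^2$):
      Variance of $\bar{Y}_1 - \bar{Y}_2$: $\sigma^2\left(\frac{1}{n_1} + \frac{1}{n_2}\right)$
      Variance Estimate: $s_p^2\left(\frac{1}{n_1} + \frac{1}{n_2}\right)$
  • Unequal ($\sigma_1^2 \neq \sigma_2^2$):
      Variance of $\bar{Y}_1 - \bar{Y}_2$: $\frac{\sigma_1^2}{n_1} + \frac{\sigma_2^2}{n_2}$
      Variance Estimate: $\frac{s_1^2}{n_1} + \frac{s_2^2}{n_2}$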

SLIDE 16

Independent Samples, Equal Variances: Hypothesis Tests for Comparing Two Population Means

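Ho: $\mu_1 - \mu_2 = D_0$

  • Ha: $\mu_1 - \mu_2 < D_0$
  • Ha: $\mu_1 - \mu_2 > D_0$
  • Ha: $\mu_1 - \mu_2 \neq D_0$

Test statistic:

$t_{\text{obs}} = \dfrac{(\bar{y}_1 - \bar{y}_2) - D_0}{s_p \sqrt{\frac{1}{n_1} + \frac{1}{n_2}}}$

where

$s_p^2 = \dfrac{(n_1 - 1)s_1^2 + (n_2 - 1)s_2^2}{n_1 + n_2 - 2}$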
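The pooled test statistic can be sketched as a small function. This is an illustration under stated assumptions: the function name `pooled_t_statistic` and the summary statistics in the example call are hypothetical and were chosen only to keep the arithmetic easy to follow. They are not the drainage data.

```python
from math import sqrt

def pooled_t_statistic(y1, s1, n1, y2, s2, n2, d0=0.0):
    """Return (t_obs, df) for the equal-variance two-sample t test."""
    df = n1 + n2 - 2
    # Pooled variance: average of the sample variances, weighted by n_i - 1.
    sp2 = ((n1 - 1) * s1**2 + (n2 - 1) * s2**2) / df
    # Estimated standard error of the difference in sample means.
    se = sqrt(sp2) * sqrt(1 / n1 + 1 / n2)
    return ((y1 - y2) - d0) / se, df

# Hypothetical summary statistics, chosen only to make the arithmetic easy.
t_obs, df = pooled_t_statistic(y1=10.0, s1=2.0, n1=8, y2=8.0, s2=2.0, n2=8)
print(t_obs, df)
```

With equal sample standard deviations the pooled standard deviation equals the common value, so the standard error here is 2 × sqrt(1/8 + 1/8) = 1 and the statistic is just the difference in means.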

SLIDE 17

Independent Samples, Equal Variances: Hypothesis Test P-values

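Ho: $\mu_1 - \mu_2 = D_0$

  • Ha: $\mu_1 - \mu_2 < D_0$. P-value: $P(T < t_{\text{obs}})$
  • Ha: $\mu_1 - \mu_2 > D_0$. P-value: $P(T > t_{\text{obs}})$
  • Ha: $\mu_1 - \mu_2 \neq D_0$. P-value: $2P(T > |t_{\text{obs}}|)$

Here $T$ has a $t$ distribution with $n_1 + n_2 - 2$ degrees of freedom.

Decision Rule: Reject Ho if the p-value is less than or equal to α.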

SLIDE 18

Random Assignment of Treatment to Experimental Units

Is the average amount of drainage water collected from clay soil different from the average amount of drainage water collected from sandy soil?

SLIDE 19

Inference for Two Means

Is the average amount of drainage water collected from clay soil different from the average amount of drainage water collected from sandy soil? Use a significance level of 0.05.

SLIDE 20

Inference for Two Means, Independent Samples, Equal Variances: Confidence Interval

A 100(1 − α)% confidence interval for the difference in population means is

$(\bar{y}_1 - \bar{y}_2) \pm t_{\alpha/2,\, n_1 + n_2 - 2}\; s_p \sqrt{\frac{1}{n_1} + \frac{1}{n_2}}$

    Two Sample t-test

    data:  drain$drainage by drain$soil
    t = 1.5148, df = 30, p-value = 0.1403
    alternative hypothesis: true difference in means is not equal to 0
    95 percent confidence interval:
     -1.42650  9.61915
    sample estimates:
    mean in group clay mean in group sand
              46.76480           42.66848
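The interval in this output can be checked against the formula. In the sketch below, the standard error is not reported in the output, so it is back-calculated from the reported t statistic (an assumption of this check), and the critical value $t_{0.025,\,30} \approx 2.0423$ is taken from a t table:

```latex
\begin{align*}
\bar{y}_1 - \bar{y}_2 &= 46.76480 - 42.66848 = 4.09632 \\
s_p \sqrt{\tfrac{1}{n_1} + \tfrac{1}{n_2}}
  &\approx \frac{4.09632}{1.5148} \approx 2.704 \\
4.09632 \pm 2.0423 \times 2.704
  &\approx 4.10 \pm 5.52 \approx (-1.43,\ 9.62)
\end{align*}
```

The interval contains 0, which agrees with the p-value of 0.1403 exceeding the 0.05 significance level: Ho is not rejected.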

SLIDE 21

Inference for Comparing Two Population Means: Independent Samples

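$\sigma_1^2$ and $\sigma_2^2$

  • Equal ($\sigma_1^2 = \sigma_2^2 = \sigma^2$):
      Variance of $\bar{Y}_1 - \bar{Y}_2$: $\sigma^2\left(\frac{1}{n_1} + \frac{1}{n_2}\right)$
      Variance Estimate: $s_p^2\left(\frac{1}{n_1} + \frac{1}{n_2}\right)$
  • Unequal ($\sigma_1^2 \neq \sigma_2^2$):
      Variance of $\bar{Y}_1 - \bar{Y}_2$: $\frac{\sigma_1^2}{n_1} + \frac{\sigma_2^2}{n_2}$
      Variance Estimate: $\frac{s_1^2}{n_1} + \frac{s_2^2}{n_2}$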

SLIDE 22

Independent Samples, Unequal Variances: Hypothesis Tests for Comparing Two Population Means

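Ho: $\mu_1 - \mu_2 = D_0$

  • Ha: $\mu_1 - \mu_2 < D_0$
  • Ha: $\mu_1 - \mu_2 > D_0$
  • Ha: $\mu_1 - \mu_2 \neq D_0$

Test statistic:

$t'_{\text{obs}} = \dfrac{(\bar{y}_1 - \bar{y}_2) - D_0}{\sqrt{\frac{s_1^2}{n_1} + \frac{s_2^2}{n_2}}}$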

SLIDE 23

Distribution of the Test Statistic

$t' \mathrel{\dot\sim} t(\text{df})$, where

$\text{df} = \dfrac{(n_1 - 1)(n_2 - 1)}{(1 - c)^2 (n_1 - 1) + c^2 (n_2 - 1)}$

and

$c = \dfrac{s_1^2/n_1}{\frac{s_1^2}{n_1} + \frac{s_2^2}{n_2}}$
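The unequal-variance statistic and its approximate degrees of freedom can be computed together. The block below is a minimal sketch: the function name `welch_t_and_df` and the summary statistics in the example call are hypothetical and are not the drainage data.

```python
from math import sqrt

def welch_t_and_df(y1, s1, n1, y2, s2, n2, d0=0.0):
    """Return (t'_obs, df) for the unequal-variance two-sample t test."""
    v1, v2 = s1**2 / n1, s2**2 / n2  # estimated variance of each sample mean
    t_prime = ((y1 - y2) - d0) / sqrt(v1 + v2)
    c = v1 / (v1 + v2)
    df = (n1 - 1) * (n2 - 1) / ((1 - c) ** 2 * (n1 - 1) + c**2 * (n2 - 1))
    return t_prime, df

# Hypothetical summary statistics (not the drainage data).
t_prime, df = welch_t_and_df(y1=10.0, s1=2.0, n1=8, y2=8.0, s2=4.0, n2=8)
print(round(t_prime, 4), round(df, 2))
```

The resulting df is usually not an integer and always falls between min(n1, n2) − 1 and n1 + n2 − 2. When the two sample variances and sample sizes are equal, c = 0.5 and df reduces to n1 + n2 − 2, matching the pooled test.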

SLIDE 24

Independent Samples, Unequal Variances: Hypothesis Test P-values

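Ho: $\mu_1 - \mu_2 = D_0$

  • Ha: $\mu_1 - \mu_2 < D_0$. P-value: $P(T < t'_{\text{obs}})$
  • Ha: $\mu_1 - \mu_2 > D_0$. P-value: $P(T > t'_{\text{obs}})$
  • Ha: $\mu_1 - \mu_2 \neq D_0$. P-value: $2P(T > |t'_{\text{obs}}|)$

Here $T$ has a $t$ distribution with the approximate degrees of freedom df defined for $t'$.

Decision Rule: Reject Ho if the p-value is less than or equal to α.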

SLIDE 25

Inference for Two Means (Unequal Variances): Example Using PROC TTEST

Is the average amount of drainage water collected from clay soil different from the average amount of drainage water collected from sandy soil? Use a significance level of 0.05.

SLIDE 26

Inference for Two Means, Independent Samples, Unequal Variances: Confidence Interval

A 100(1 − α)% confidence interval for the difference in population means when the population variances are not equal is

$(\bar{y}_1 - \bar{y}_2) \pm t_{\alpha/2,\,\text{df}} \sqrt{\frac{s_1^2}{n_1} + \frac{s_2^2}{n_2}}$

where

$\text{df} = \dfrac{(n_1 - 1)(n_2 - 1)}{(1 - c)^2 (n_1 - 1) + c^2 (n_2 - 1)}$

and

$c = \dfrac{s_1^2/n_1}{\frac{s_1^2}{n_1} + \frac{s_2^2}{n_2}}$
