modeling Dongmei Li Department of Public Health Sciences Office of - - PowerPoint PPT Presentation



slide-1
SLIDE 1

Dongmei Li Department of Public Health Sciences Office of Public Health Studies University of Hawai’i at Mānoa

Quantitative response variable modeling

slide-2
SLIDE 2

Outline

  • T-test
  • ANOVA
  • Correlation and simple linear regression

slide-3
SLIDE 3

One-sample t test

3

slide-4
SLIDE 4

One-Sample t Test: Example

Statement of the problem:

  • Do Sudden Infant Death Syndrome (SIDS) babies have lower than average birth weights?
  • We know from prior research that the mean birth weight of the non-SIDS babies in the population is 3300 grams.
  • We study n = 10 SIDS babies, determine their birth weights, and calculate x-bar = 2975.5 and s = 737.3.
  • Do these data provide significant evidence that SIDS babies have different birth weights than the rest of the population?

SIDS baby weights (grams): 2229 2997 2314 3831 1788 2745 4151 2975 3463 3262

slide-5
SLIDE 5

One-Sample t Test

  • A. Hypotheses. H0: µ = µ0 vs. Ha: µ ≠ µ0 (two-sided) [Ha: µ < µ0 (left-sided) or Ha: µ > µ0 (right-sided)]
  • B. Test statistic.

$$t_{\text{stat}} = \frac{\bar{x} - \mu_0}{s / \sqrt{n}} \quad \text{with } df = n - 1$$

  • C. P-value. Convert tstat to P-value [software]. Small P → strong evidence against H0
  • D. Significance level α (compare P-value with α to determine whether you will reject the null hypothesis or not).

slide-6
SLIDE 6

One-Sample t Test: Example

  • A. Hypotheses. H0: µ = 3300 versus Ha: µ ≠ 3300 (two-sided)
  • B. Test statistic

$$t_{\text{stat}} = \frac{\bar{x} - \mu_0}{SE_{\bar{x}}} = \frac{2975.5 - 3300}{737.3 / \sqrt{10}} = -1.39, \qquad df = n - 1 = 10 - 1 = 9$$

  • C. P = 0.1974. Weak evidence against H0
  • D. Data are not significant at α = .10. Fail to reject the null hypothesis.
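The test statistic for the SIDS example can be checked numerically. The following is a minimal Python sketch using only the standard library; the variable names (`weights`, `mu0`, `t_stat`) are illustrative and do not come from the slides. It reproduces x-bar, s, and tstat from the ten listed birth weights. Converting tstat to a P-value still requires a t table or statistical software.

```python
import math
import statistics

# SIDS birth weights (grams) as listed in the example
weights = [2229, 2997, 2314, 3831, 1788, 2745, 4151, 2975, 3463, 3262]
mu0 = 3300  # mean birth weight under H0 (non-SIDS population)

n = len(weights)
x_bar = statistics.mean(weights)   # sample mean
s = statistics.stdev(weights)      # sample standard deviation (n - 1 in the denominator)
se = s / math.sqrt(n)              # standard error of the mean
t_stat = (x_bar - mu0) / se        # one-sample t statistic
df = n - 1                         # degrees of freedom

print(f"x-bar = {x_bar:.1f}, s = {s:.1f}, t = {t_stat:.2f}, df = {df}")
```

The negative sign of tstat reflects that the sample mean lies below the hypothesized 3300 grams.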

slide-7
SLIDE 7

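Confidence Interval for µ

$$(1 - \alpha)100\% \text{ CI for } \mu = \bar{x} \pm t_{n-1,\,1-\alpha/2} \cdot \frac{s}{\sqrt{n}}$$

  • Typical point “estimate ± margin of error” formula
  • $t_{n-1,\,1-\alpha/2}$ is from t table
  • Alternative formula:

$$\bar{x} \pm t_{n-1,\,1-\alpha/2} \cdot SE_{\bar{x}} \quad \text{where } SE_{\bar{x}} = \frac{s}{\sqrt{n}}$$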

slide-8
SLIDE 8

Confidence Interval: Example 1

Let us calculate a 95% confidence interval for μ for the birth weight of SIDS babies.

$$\bar{x} = 2975.5, \quad s = 737.3, \quad n = 10$$

$$95\% \text{ CI for } \mu = \bar{x} \pm t_{10-1,\,1-.05/2} \cdot \frac{s}{\sqrt{n}} = 2975.5 \pm 2.262 \cdot \frac{737.3}{\sqrt{10}} = 2975.5 \pm 527.4 = (2448.1 \text{ to } 3502.9) \text{ grams}$$
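The interval arithmetic can be sketched in Python as follows. This is an illustrative check, not the presenter's own procedure: the critical value 2.262 is hard-coded from the slide's t table lookup because the standard library has no t quantile function, and the variable names are assumptions.

```python
import math
import statistics

# SIDS birth weights (grams) from the one-sample example
weights = [2229, 2997, 2314, 3831, 1788, 2745, 4151, 2975, 3463, 3262]
t_crit = 2.262  # t critical value for df = 9 and 95% confidence, as quoted on the slide

n = len(weights)
x_bar = statistics.mean(weights)
se = statistics.stdev(weights) / math.sqrt(n)  # standard error of the mean
margin = t_crit * se                           # margin of error
ci = (x_bar - margin, x_bar + margin)          # (lower, upper) limits in grams

print(f"95% CI for mu: {ci[0]:.1f} to {ci[1]:.1f} grams (margin {margin:.1f})")
```

The interval contains the hypothesized value of 3300 grams, which agrees with the test's failure to reject H0.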

slide-9
SLIDE 9

How to do it in Excel?

  • Open the Presentation3_SIDS_BW.xlsx data set
  • Use the AVERAGE function in Excel to calculate the mean
  • Use the STDEV function in Excel to calculate the standard deviation
  • Use the TDIST function to calculate the p-value
  • Use the TINV function to get the critical value

slide-10
SLIDE 10

How to do it in JMP?

  • Open the Presentation3_SIDS_BW.jmp data set
  • Analyze---Distribution---Put SIDS_BW into the Y, Columns box---click OK
  • Click on the red arrow next to SIDS_BW---click on Test Mean---input 3300 for hypothesized mean---click OK
  • Click Confidence Interval---95 to get the 95% confidence interval

slide-11
SLIDE 11

Paired-sample t test

11

slide-12
SLIDE 12

Paired Samples

  • Paired samples: Each point in one sample is matched to a unique point in the other sample
  • Pairs can be achieved via sequential samples within individuals (e.g., pre-test/post-test), cross-over trials, and matching procedures
  • Also called “matched-pairs” and “dependent samples”

slide-13
SLIDE 13

Example: Paired Samples

  • A study addresses whether oat bran reduces LDL cholesterol with a cross-over design.
  • Subjects “cross-over” from a cornflake diet to an oat bran diet.
      • Half of the subjects start on CORNFLK, half on OATBRAN
      • Two weeks on diet 1
      • Measure LDL cholesterol
      • Washout period
      • Switch diet
      • Two weeks on diet 2
      • Measure LDL cholesterol

slide-14
SLIDE 14

Example, Data

Subject  CORNFLK  OATBRAN
-------  -------  -------
   1       4.61     3.84
   2       6.42     5.57
   3       5.40     5.85
   4       4.54     4.80
   5       3.98     3.68
   6       3.82     2.96
   7       5.01     4.41
   8       4.34     3.72
   9       3.80     3.49
  10       4.56     3.84
  11       5.35     5.26
  12       3.89     3.73

slide-15
SLIDE 15

Calculate Difference Variable “DELTA”

  • Step 1 is to create the difference variable “DELTA”
  • Let DELTA = CORNFLK - OATBRAN
  • Order of subtraction does not materially affect results (but does change the sign of the differences)
  • Here are the first three observations:

ID   CORNFLK  OATBRAN  DELTA
---  -------  -------  -----
 1     4.61     3.84    0.77
 2     6.42     5.57    0.85
 3     5.40     5.85   -0.45
 ↓      ↓        ↓       ↓

Positive values represent lower LDL on oat bran

slide-16
SLIDE 16

Explore DELTA Values

Here are all twelve paired differences (DELTAs):

0.77, 0.85, −0.45, −0.26, 0.30, 0.86, 0.60, 0.62, 0.31, 0.72, 0.09, 0.16

Stemplot

|-0|42
|+0|0133
|+0|667788
×1

EDA shows a slight negative skew, a median of about 0.45, with results varying from −0.4 to 0.8.

slide-17
SLIDE 17

Descriptive stats for DELTA

Data (DELTAs): 0.77, 0.85, −0.45, −0.26, 0.30, 0.86, 0.60, 0.62, 0.31, 0.72, 0.09, 0.16

$$n = 12, \quad \bar{x}_d = 0.3808, \quad s_d = 0.4335$$

The subscript d will be used to denote statistics for the difference variable DELTA

slide-18
SLIDE 18

95% Confidence Interval for µd

$$(1 - \alpha)100\% \text{ CI for } \mu_d = \bar{x}_d \pm t_{n-1,\,1-\alpha/2} \cdot \frac{s_d}{\sqrt{n}}$$

A t procedure directed toward the DELTA variable calculates the confidence interval for the mean difference.

“Oat bran” data:

For 95% confidence use $t_{12-1,\,1-.05/2} = t_{11,\,.975} = 2.201$ (from t Table)

$$95\% \text{ CI for } \mu_d = 0.3808 \pm 2.201 \cdot \frac{0.4335}{\sqrt{12}} = 0.3808 \pm 0.2754 = (0.105 \text{ to } 0.656)$$

slide-19
SLIDE 19

Paired t Test

  • Similar to one-sample t test
  • μ0 is usually set to 0, representing “no mean difference”, i.e., H0: μd = 0
  • Test statistic:

$$t_{\text{stat}} = \frac{\bar{x}_d - \mu_0}{s_d / \sqrt{n}} \quad \text{with } df = n - 1$$

slide-20
SLIDE 20

Paired t Test: Example

“Oat bran” data

  • A. Hypotheses. H0: µd = 0 vs. Ha: µd ≠ 0
  • B. Test statistic.

$$t_{\text{stat}} = \frac{\bar{x}_d - \mu_0}{s_d / \sqrt{n}} = \frac{0.38083 - 0}{0.4335 / \sqrt{12}} = 3.043, \qquad df = n - 1 = 12 - 1 = 11$$

  • C. P-value. P = 0.011 (via computer). The evidence against H0 is statistically significant.
  • D. Significance level. The evidence against H0 is significant at α = .05 but is not significant at α = .01.
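The whole paired analysis reduces to a one-sample procedure on the DELTA column, which the following Python sketch illustrates with the twelve CORNFLK/OATBRAN pairs from the data slide. It is a minimal standard-library example with assumed variable names; the critical value 2.201 is copied from the slide's t table lookup.

```python
import math
import statistics

# LDL cholesterol for the 12 subjects on each diet (from the data slide)
cornflk = [4.61, 6.42, 5.40, 4.54, 3.98, 3.82, 5.01, 4.34, 3.80, 4.56, 5.35, 3.89]
oatbran = [3.84, 5.57, 5.85, 4.80, 3.68, 2.96, 4.41, 3.72, 3.49, 3.84, 5.26, 3.73]

# Step 1: difference variable DELTA = CORNFLK - OATBRAN
delta = [c - o for c, o in zip(cornflk, oatbran)]

n = len(delta)
d_bar = statistics.mean(delta)     # mean difference
s_d = statistics.stdev(delta)      # standard deviation of the differences
se_d = s_d / math.sqrt(n)          # standard error of the mean difference
t_stat = (d_bar - 0) / se_d        # paired t statistic under H0: mu_d = 0
df = n - 1

t_crit = 2.201                     # t critical value for df = 11, 95% confidence (from the slide)
ci = (d_bar - t_crit * se_d, d_bar + t_crit * se_d)

print(f"mean = {d_bar:.4f}, s = {s_d:.4f}, t = {t_stat:.3f}, df = {df}")
print(f"95% CI for mu_d: {ci[0]:.3f} to {ci[1]:.3f}")
```

Reversing the order of subtraction would flip the signs of `d_bar`, `t_stat`, and the interval limits without changing their magnitudes.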

slide-21
SLIDE 21

How to do it in Excel?

  • Open the Presentation3_cornflk.xlsx data set
  • Paired sample t test in Excel

slide-22
SLIDE 22

How to do it in Excel?

22

slide-23
SLIDE 23

Results and interpretation

  • P-value = 0.0112: we have evidence that Oatbran significantly lowers LDL cholesterol level compared to Cornflk.

slide-24
SLIDE 24

How to do it in JMP?

24

slide-25
SLIDE 25

How to do it in JMP?

P-value = 0.0112: we have evidence that Oatbran significantly lowers LDL cholesterol level compared to Cornflk. Notice that the 95% confidence interval for the difference does not include 0.

slide-26
SLIDE 26

Conditions for Inference

t procedures require these conditions:

  • SRS (individual observations or DELTAs)
  • Valid information (no information bias)
  • Normal population or large sample (central limit theorem)

slide-27
SLIDE 27

The Normality Condition

The Normality condition applies to the sampling distribution of the mean, not the population. Therefore, it is OK to use t procedures when:

  • The population is Normal
  • The population is not Normal but is symmetrical and n is at least 5 to 10
  • The population is skewed and n is at least 30 to 100 (depending on the extent of the skew)

slide-28
SLIDE 28

Can t procedures be used?

  • Dataset A is skewed and small: avoid t procedures
  • Dataset B has a mild skew and is moderate in size: use t procedures
  • Dataset C is highly skewed and is small: avoid t procedures

slide-29
SLIDE 29

Two-independent sample t test

29

slide-30
SLIDE 30

Example: Cholesterol and Type A & B Personality

Do fasting cholesterol levels differ in Type A and Type B personality men? Data (mg/dl) are a subset from the Western Collaborative Group Study*

Group 1 (Type A personality): 233, 291, 312, 250, 246, 197, 268, 224, 239, 239, 254, 276, 234, 181, 248, 252, 202, 218, 212, 325

Group 2 (Type B personality): 344, 185, 263, 246, 224, 212, 188, 250, 148, 169, 226, 175, 242, 252, 153, 183, 137, 202, 194, 213

slide-31
SLIDE 31

Exploratory & Descriptive Methods

  • Start with EDA
  • Compare group shapes, locations and spreads
  • Examples of applicable techniques
      • Side-by-side stemplots (below)
      • Side-by-side boxplots (next slide)

Group 1 |  | Group 2
        |1t|3
        |1f|45
        |1s|67
      98|1.|8889
     110|2*|011
   33332|2t|22
   55544|2f|4455
      76|2s|6
       9|2.|
      21|3*|
        |3t|
        |3f|4
        (×100)

slide-32
SLIDE 32

Side-by-Side Boxplots

[Figure: side-by-side boxplots of cholesterol by GROUP (N = 20 in each group; vertical axis from 100 to 400)]

Interpretation:

  • Location: group 1 > group 2
  • Spreads: group 1 < group 2
  • Shapes: Both fairly symmetrical, outside values in each; no major departures from Normality

slide-33
SLIDE 33

Summary Statistics

Group   n    mean     std dev
-----  ---  -------   -------
  1    20   245.05     36.64
  2    20   210.30     48.34

slide-34
SLIDE 34

Inference About Mean Difference (Notation)

Parameters (population)
  Group 1   N1   µ1   σ1
  Group 2   N2   µ2   σ2

Statistics (sample)
  Group 1   n1   $\bar{x}_1$   s1
  Group 2   n2   $\bar{x}_2$   s2

$\bar{x}_1 - \bar{x}_2$ is the point estimator of $\mu_1 - \mu_2$

slide-35
SLIDE 35

Hypothesis Test

  • A. Hypotheses. H0: μ1 = μ2 against Ha: μ1 ≠ μ2 (two-sided) [Ha: μ1 > μ2 (right-sided) or Ha: μ1 < μ2 (left-sided)]
  • B. Test statistic.

$$t_{\text{stat}} = \frac{\bar{x}_1 - \bar{x}_2}{SE_{\bar{x}_1 - \bar{x}_2}} \quad \text{where } SE_{\bar{x}_1 - \bar{x}_2} = \sqrt{\frac{s_1^2}{n_1} + \frac{s_2^2}{n_2}} \text{ with } df = df_{\text{Welch}}$$

  • C. P-value. Convert the tstat to P-value with t table or software. Interpret.
  • D. Significance level (optional). Compare P to prior specified α level.

slide-36
SLIDE 36

Hypothesis Test – Example

  • A. Hypotheses. H0: μ1 = μ2 vs. Ha: μ1 ≠ μ2
  • B. Test stat. In prior analyses we calculated sample mean difference = 34.75 mg/dL, SE = 13.563 and dfconserv = 19.

$$t_{\text{stat}} = \frac{\bar{x}_1 - \bar{x}_2}{SE_{\bar{x}_1 - \bar{x}_2}} = \frac{34.75}{13.563} = 2.56 \quad \text{with } df = 19$$

  • C. P-value. P = 0.019 → good evidence against H0 (“significant difference”).
  • D. Significance level (optional). The evidence against H0 is significant at α = 0.05 but not at α = 0.01.
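The quantities quoted in step B can be recomputed from the raw cholesterol values. The sketch below is a minimal standard-library illustration with assumed variable names. The slides do not spell out a formula for the Welch degrees of freedom, so the Welch-Satterthwaite approximation used here is the standard textbook form rather than something taken from the deck; the conservative df is the smaller of n1 − 1 and n2 − 1.

```python
import math
import statistics

# Fasting cholesterol (mg/dl) from the Type A / Type B example
type_a = [233, 291, 312, 250, 246, 197, 268, 224, 239, 239,
          254, 276, 234, 181, 248, 252, 202, 218, 212, 325]
type_b = [344, 185, 263, 246, 224, 212, 188, 250, 148, 169,
          226, 175, 242, 252, 153, 183, 137, 202, 194, 213]

n1, n2 = len(type_a), len(type_b)
mean_diff = statistics.mean(type_a) - statistics.mean(type_b)
v1, v2 = statistics.variance(type_a), statistics.variance(type_b)  # sample variances

# Unequal-variance (Welch) standard error of the mean difference
se = math.sqrt(v1 / n1 + v2 / n2)
t_stat = mean_diff / se

# Conservative df: the smaller of the two group dfs
df_conserv = min(n1 - 1, n2 - 1)

# Welch-Satterthwaite approximation (assumed standard formula, not shown on the slides)
df_welch = (v1 / n1 + v2 / n2) ** 2 / (
    (v1 / n1) ** 2 / (n1 - 1) + (v2 / n2) ** 2 / (n2 - 1)
)

print(f"difference = {mean_diff:.2f}, SE = {se:.3f}, t = {t_stat:.2f}")
print(f"df (conservative) = {df_conserv}, df (Welch) = {df_welch:.1f}")
```

The Welch df falls between the conservative value and n1 + n2 − 2, so using dfconserv gives a somewhat larger P-value than software that applies the Welch approximation.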

slide-37
SLIDE 37

Equal Variance t Procedure

  • Also called pooled variance t procedure
  • Not as robust as prior method, but…
      • Historically important
      • Calculated by software programs
      • Leads to advanced ANOVA techniques

slide-38
SLIDE 38

Pooled variance procedure

We start by calculating this pooled estimate of variance

$$s^2_{\text{pooled}} = \frac{(df_1)(s_1^2) + (df_2)(s_2^2)}{df_1 + df_2} \quad \text{where } s_i^2 \text{ is the variance in group } i \text{ and } df_i = n_i - 1$$

slide-39
SLIDE 39

  • The pooled variance is used to calculate this standard error estimate:

$$SE_{\bar{x}_1 - \bar{x}_2} = \sqrt{s^2_{\text{pooled}} \left( \frac{1}{n_1} + \frac{1}{n_2} \right)}$$

  • Confidence Interval

$$(\bar{x}_1 - \bar{x}_2) \pm (t_{df,\,1-\alpha/2})(SE_{\bar{x}_1 - \bar{x}_2})$$

  • Test statistic

$$t_{\text{stat}} = \frac{\bar{x}_1 - \bar{x}_2}{SE_{\bar{x}_1 - \bar{x}_2}}$$

  • All with df = df1 + df2 = (n1−1) + (n2−1)

slide-40
SLIDE 40

Pooled Variance t Confidence Interval

Data

Group   ni     si     xbari
-----  ---   -----   ------
  1    20    36.64   245.05
  2    20    48.34   210.30

$$SE_{\bar{x}_1 - \bar{x}_2} = \sqrt{1839.623 \left( \frac{1}{20} + \frac{1}{20} \right)} = 13.56, \qquad df = (20 - 1) + (20 - 1) = 38$$

$$95\% \text{ CI for } \mu_1 - \mu_2 = (\bar{x}_1 - \bar{x}_2) \pm (t_{38,\,.975})(SE_{\bar{x}_1 - \bar{x}_2}) = (245.05 - 210.30) \pm (2.02)(13.56) = 34.75 \pm 27.39 = (7.36,\ 62.14)$$

slide-41
SLIDE 41

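Pooled Variance t Test

Data:

Group   ni     si     xbari
-----  ---   -----   ------
  1    20    36.64   245.05
  2    20    48.34   210.30

$$SE_{\bar{x}_1 - \bar{x}_2} = \sqrt{1839.623 \left( \frac{1}{20} + \frac{1}{20} \right)} = 13.56, \qquad df = (20 - 1) + (20 - 1) = 38$$

$$H_0: \mu_1 = \mu_2 \qquad H_a: \mu_1 \ne \mu_2$$

$$t_{\text{stat}} = \frac{\bar{x}_1 - \bar{x}_2}{SE_{\bar{x}_1 - \bar{x}_2}} = \frac{34.75}{13.56} = 2.56; \quad df = 38; \quad P = .014$$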
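The pooled-variance calculations on the last few slides can be traced step by step from the summary statistics. The following Python sketch is an illustration under stated assumptions: the variable names are invented here, and the critical value 2.02 is taken from the slide's t table lookup rather than computed.

```python
import math

# Summary statistics from the data table (cholesterol, mg/dl)
n1, s1, xbar1 = 20, 36.64, 245.05
n2, s2, xbar2 = 20, 48.34, 210.30
t_crit = 2.02  # t critical value for df = 38, 95% confidence (from the slide)

df1, df2 = n1 - 1, n2 - 1
df = df1 + df2

# Pooled estimate of variance: df-weighted average of the group variances
s2_pooled = (df1 * s1 ** 2 + df2 * s2 ** 2) / df

# Standard error of the mean difference under the equal-variance assumption
se = math.sqrt(s2_pooled * (1 / n1 + 1 / n2))

diff = xbar1 - xbar2
t_stat = diff / se
margin = t_crit * se
ci = (diff - margin, diff + margin)

print(f"pooled variance = {s2_pooled:.3f}, SE = {se:.2f}, df = {df}")
print(f"t = {t_stat:.2f}, 95% CI = ({ci[0]:.2f}, {ci[1]:.2f})")
```

The limits printed here can differ from the slide's (7.36, 62.14) in the second decimal place because the slide rounds the SE to 13.56 before multiplying by the critical value.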

slide-42
SLIDE 42

How to do it in Excel?

  • Data set: Presentation3_FCL.xlsx
  • First do Levene’s test to see whether the two groups have equal variances

slide-43
SLIDE 43

How to do it in Excel?

  • Levene’s test shows no significant difference in variance between groups

slide-44
SLIDE 44

How to do it in Excel?

  • Next use t-Test: Two-Sample Assuming Equal Variances for the test

slide-45
SLIDE 45

How to do it in Excel?

  • Click OK to get results

slide-46
SLIDE 46

Excel results

  • p-value = 0.014
  • Significant difference in fasting cholesterol levels between Type A personality subjects and Type B personality subjects.

slide-47
SLIDE 47

How to do it in JMP?

  • Data set: Presentation3_FCL.jmp
  • Analyze---Fit Y by X

slide-48
SLIDE 48

How to do it in JMP?

  • Select Means/ANOVA/Pooled t for the equal variance t test
  • Select t Test for the unequal variance t test

slide-49
SLIDE 49

Results from JMP

  • p-value < 0.05
  • Significant difference in fasting cholesterol levels between Type A personality subjects and Type B personality subjects.

slide-50
SLIDE 50

Conditions for Inference

Conditions required for t procedures:

“Validity conditions”

  • a. Good information (no information bias)
  • b. Good sample (“no selection bias”)
  • c. “No confounding”

“Sampling conditions”

  • a. Independence
  • b. Normal sampling distribution

slide-51
SLIDE 51

ANOVA

51

slide-52
SLIDE 52

Illustrative Example: Data

Pets as moderators of a stress response. This chapter follows the analysis of data from a study in which heart rates (bpm) of participants were monitored after being exposed to a psychological stressor. Participants were randomized to one of three groups:

  • Group 1 - monitored in the presence of a pet dog
  • Group 2 - monitored in the presence of a human friend
  • Group 3 - monitored with neither dog nor human friend present

slide-53
SLIDE 53

53

Illustrative Example: Data

slide-54
SLIDE 54

Descriptive Statistics

  • Data are described and explored before moving to inferential calculations
  • Here are summary statistics by group:

slide-55
SLIDE 55

55

Side-by-Side Boxplots

slide-56
SLIDE 56

Analysis of Variance

  • One-way ANalysis Of VAriance (ANOVA)
      • Categorical explanatory variable
      • Quantitative response variable
      • Test group means for a significant difference
  • Statistical hypotheses
      • H0: μ1 = μ2 = … = μk
      • Ha: at least one of the μi differs
  • Method: compare variability between groups to variability within groups (F statistic)

slide-57
SLIDE 57

Analysis of Variance, cont.

R. A. Fisher (1890-1962). The F in the F statistic stands for “Fisher”

slide-58
SLIDE 58

58

Mean Square Between: Graphically

slide-59
SLIDE 59

59

Mean Square Between: Example

slide-60
SLIDE 60

60

Mean Square Within: Graphically

slide-61
SLIDE 61

61

Mean Square Within: Example

slide-62
SLIDE 62

The F statistic and ANOVA table

  • Data are arranged to form an ANOVA table
  • F statistic is the ratio of the MSB to MSW

$$F_{\text{stat}} = \frac{MSB}{MSW} = \frac{1193.843}{84.793} = 14.08$$

Fstat is a “signal-to-noise” ratio
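The ratio of mean squares can be sketched in Python as follows. The heart-rate data themselves are not reproduced in this transcript, so the function is demonstrated on a small made-up data set (`toy_groups`, purely hypothetical and NOT the pet study) and the slide's own MSB and MSW are simply divided to confirm the reported Fstat. The function and variable names are assumptions for illustration.

```python
import statistics

def one_way_anova_f(groups):
    """Return (F, df_between, df_within) for a list of samples, one list per group."""
    k = len(groups)
    n_total = sum(len(g) for g in groups)
    grand_mean = sum(sum(g) for g in groups) / n_total

    # Between-group sum of squares: group means around the grand mean, weighted by group size
    ssb = sum(len(g) * (statistics.mean(g) - grand_mean) ** 2 for g in groups)
    # Within-group sum of squares: observations around their own group mean
    ssw = sum(sum((x - statistics.mean(g)) ** 2 for x in g) for g in groups)

    df_b, df_w = k - 1, n_total - k
    msb, msw = ssb / df_b, ssw / df_w   # mean squares ("signal" and "noise")
    return msb / msw, df_b, df_w

# Hypothetical toy data for illustration only (not the pet study data)
toy_groups = [[1, 2, 3], [2, 3, 4], [5, 6, 7]]
f_toy, df_b, df_w = one_way_anova_f(toy_groups)

# Check of the arithmetic on the slide, using the MSB and MSW it reports
f_slide = 1193.843 / 84.793

print(f"toy F = {f_toy:.2f} with df = ({df_b}, {df_w})")
print(f"slide F = {f_slide:.2f}")
```

A large F means the group means vary far more than would be expected from the within-group scatter alone.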

slide-63
SLIDE 63

Fstat and P-value

  • The Fstat has numerator and denominator degrees of freedom: df1 and df2 respectively (corresponding to dfB and dfW)
  • Convert Fstat to P-value with a computer program
  • The P-value corresponds to the area in the right tail beyond the Fstat

slide-64
SLIDE 64

64

Fstat and P-value

P < 0.001

slide-65
SLIDE 65

How to do one-way ANOVA in EXCEL?

  • Data set: Presentation3_pet.xlsx

slide-66
SLIDE 66

How to do one-way ANOVA in EXCEL?

66

slide-67
SLIDE 67

Analysis results from Excel

  • One-way ANOVA shows a significant difference in mean heart rate among the three groups.

slide-68
SLIDE 68

How to do one-way ANOVA in JMP?

  • Presentation3_pet.jmp file
  • Analyze---Fit Y by X

slide-69
SLIDE 69

How to do one-way ANOVA in JMP?

69

slide-70
SLIDE 70

Analysis results from JMP

  • Pairwise comparisons from Tukey’s method show significant differences among all three groups.

slide-71
SLIDE 71

Correlation and simple linear regression

71

slide-72
SLIDE 72

Data type for correlation and regression

  • Quantitative response variable Y (“dependent variable”)
  • Quantitative explanatory variable X (“independent variable”)
  • Historically important public health data set used to illustrate techniques (Doll, 1955)
      • n = 11 countries
      • Explanatory variable = per capita cigarette consumption in 1930 (CIG1930)
      • Response variable = lung cancer mortality per 100,000 (LUNGCA)

slide-73
SLIDE 73

73

Data, cont.

slide-74
SLIDE 74

74

Scatterplot

Bivariate (xi, yi) points plotted as scatter plot.

slide-75
SLIDE 75

Inspect scatterplot’s

  • Form: Can the relation be described with a straight or some other type of line?
  • Direction: Do points trend upward or downward?
  • Strength of association: Do points adhere closely to an imaginary trend line?
  • Outliers (if any): Are there any striking deviations from the overall pattern?

slide-76
SLIDE 76

Judging Correlational Strength

  • Correlational strength refers to the degree to which points adhere to a trend line
  • The eye is not a good judge of strength.
  • The top plot appears to show a weaker correlation than the bottom plot. However, these are plots of the same data set. (The perception of a difference is an artifact of axes scaling.)

slide-77
SLIDE 77

Correlation

Correlation coefficient r quantifies linear relationship with a number between −1 and 1.

When all points fall on a line with an upward slope, r = 1. When all data points fall on a line with a downward slope, r = −1

When data points trend upward, r is positive; when data points trend downward, r is negative.

The closer r is to 1 or −1, the stronger the correlation.

77

slide-78
SLIDE 78

Examples of correlations

78

slide-79
SLIDE 79

Calculating r (Pearson Correlation)

  • Formula

$$r = \frac{1}{n - 1} \sum z_X z_Y$$

  • Correlation coefficient tracks the degree to which X and Y “go together.”
  • Recall that z scores quantify the amount a value lies above or below its mean in standard deviation units.
  • When z scores for X and Y track in the same direction, their products are positive and r is positive (and vice versa).
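The z-score description of r can be turned into a short calculation. The sketch below assumes the usual sample form, r = Σ(z_X · z_Y) / (n − 1), with z scores based on the sample standard deviation. The data are small hypothetical vectors (`x_toy`, `y_toy`) chosen for easy arithmetic, not the Doll (1955) values, which are not reproduced in this transcript.

```python
import math
import statistics

def pearson_r(x, y):
    """Pearson correlation as the sum of z-score products divided by n - 1."""
    n = len(x)
    mx, my = statistics.mean(x), statistics.mean(y)
    sx, sy = statistics.stdev(x), statistics.stdev(y)   # sample standard deviations
    zx = [(xi - mx) / sx for xi in x]                   # z scores for X
    zy = [(yi - my) / sy for yi in y]                   # z scores for Y
    return sum(a * b for a, b in zip(zx, zy)) / (n - 1)

# Hypothetical illustrative data (not from the slides)
x_toy = [1, 2, 3, 4, 5]
y_toy = [2, 4, 5, 4, 5]
r_toy = pearson_r(x_toy, y_toy)

# Points lying exactly on a line with a downward slope give r = -1
r_line = pearson_r(x_toy, [10 - 2 * xi for xi in x_toy])

print(f"r (toy data) = {r_toy:.3f}")
print(f"r (perfect downward line) = {r_line:.3f}")
```

Pairs where both z scores share a sign add to the sum and pairs with opposite signs subtract from it, which is why an upward trend yields a positive r.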

slide-80
SLIDE 80

Calculating r, Example

80

slide-81
SLIDE 81

Scatter plot and r in Excel

  • Data set: Presentation3_CIGLungCA.xlsx
  • Scatter plot: select the data in columns B and C---Insert---Scatter Plot---Add x and y axis labels
  • Correlation: Data analysis---correlation

[Figure: scatter plot “Lung cancer mortality vs. cigarette consumption”; x axis CIG1930, y axis lung cancer mortality]

slide-82
SLIDE 82

Scatter plot and r in JMP

  • Data set: Presentation3_CIGLungCA.jmp
  • Scatter plot: Graph---Scatter plot matrix

slide-83
SLIDE 83

Interpretation of r

  • 1. Direction. The sign of r indicates the direction of the association: positive (r > 0), negative (r < 0), or no association (r ≈ 0).
  • 2. Strength. The closer r is to 1 or −1, the stronger the association.
  • 3. Coefficient of determination. The square of the correlation coefficient (r²) is called the coefficient of determination. This statistic quantifies the proportion of the variance in Y [mathematically] “explained” by X. For the illustrative data, r = 0.737 and r² = 0.54. Therefore, 54% of the variance in Y is explained by X.

slide-84
SLIDE 84

Notes, cont.

  • 4. Reversible relationship. With correlation, it does not matter whether variable X or Y is specified as the explanatory variable; calculations come out the same either way. [This will not be true for regression.]
  • 5. Outliers. Outliers can have a profound effect on r. This figure has an r of 0.82 that is fully accounted for by the single outlier.

slide-85
SLIDE 85

Notes, cont.

  • 6. Linear relations only. Correlation applies only to linear relationships. This figure shows a strong non-linear relationship, yet r = 0.00.
  • 7. Correlation does not necessarily mean causation. Beware lurking variables (next slide).

slide-86
SLIDE 86

Confounded Correlation

A near perfect negative correlation (r = −.987) was seen between cholera mortality and elevation above sea level during a 19th century epidemic.

We now know that cholera is transmitted by water. The observed relationship between cholera and elevation was confounded by the lurking variable proximity to polluted water.

slide-87
SLIDE 87

Hypothesis Test

We conduct the hypothesis test to guard against identifying too many random correlations. Random selection from a random scatter can result in an apparent correlation

87

slide-88
SLIDE 88

Hypothesis Test

  • A. Hypotheses. Let ρ represent the population correlation coefficient. H0: ρ = 0 vs. Ha: ρ ≠ 0 (two-sided) [or Ha: ρ > 0 (right-sided) or Ha: ρ < 0 (left-sided)]
  • B. Test statistic

$$t_{\text{stat}} = \frac{r}{SE_r} \quad \text{where } SE_r = \sqrt{\frac{1 - r^2}{n - 2}}, \qquad df = n - 2$$

  • C. P-value. Convert tstat to P-value with software or t table.

slide-89
SLIDE 89

Hypothesis Test – Illustrative Example

  • A. H0: ρ = 0 vs. Ha: ρ ≠ 0 (two-sided)
  • B. Test stat

$$SE_r = \sqrt{\frac{1 - 0.737^2}{11 - 2}} = 0.2253, \qquad t_{\text{stat}} = \frac{0.737}{0.2253} = 3.27, \qquad df = 11 - 2 = 9$$

  • C. .005 < P < .01 by Table C. P = .0097 by computer. The evidence against H0 is highly significant.
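The test statistic for the illustrative example needs only r and n, as the following minimal Python sketch shows. The variable names are illustrative; the P-value itself still comes from a t table or software.

```python
import math

r = 0.737   # sample correlation reported for the illustrative data
n = 11      # number of countries

df = n - 2
se_r = math.sqrt((1 - r ** 2) / df)   # standard error of r under H0: rho = 0
t_stat = r / se_r                     # t statistic for testing rho = 0
r_squared = r ** 2                    # coefficient of determination

print(f"SE_r = {se_r:.4f}, t = {t_stat:.2f}, df = {df}, r^2 = {r_squared:.2f}")
```

Because SE_r shrinks as n grows, the same r of 0.737 would give a larger tstat in a larger sample.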

slide-90
SLIDE 90

Exercise (True/False)

1. Correlation coefficient r quantifies the relationship between quantitative variables X and Y.
2. The closer r is to 1, the stronger the linear relation between X and Y.
3. If r is close to zero, X and Y are unrelated.
4. The value of r changes when the units of measure are changed.

slide-91
SLIDE 91

Regression

  • Regression describes the relationship in the data with a line that predicts the average change in Y per unit X.
  • The best fitting line is found by minimizing the sum of squared residuals, as shown in this figure.

slide-92
SLIDE 92

Regression Line, cont.

The regression line equation is:

$$\hat{y} = a + bx$$

where ŷ ≡ predicted value of Y, a ≡ the intercept of the line, and b ≡ the slope of the line

Equations to calculate a and b

SLOPE: $b = r \dfrac{s_Y}{s_X}$

INTERCEPT: $a = \bar{y} - b\bar{x}$
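A least-squares line can be computed directly from the data, as in the sketch below. The slope is written as Sxy / Sxx, which is algebraically the same as r · sY / sX, and the intercept follows from forcing the line through the point of means. The data are hypothetical (`x_toy`, `y_toy`), since the cigarette and lung cancer values are not reproduced in this transcript, and the function name is an assumption.

```python
import statistics

def least_squares_line(x, y):
    """Return (a, b) for the least-squares line y-hat = a + b * x."""
    mx, my = statistics.mean(x), statistics.mean(y)
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))  # sum of cross-products
    sxx = sum((xi - mx) ** 2 for xi in x)                     # sum of squares for X
    b = sxy / sxx          # slope: average change in Y per unit X
    a = my - b * mx        # intercept: line passes through (x-bar, y-bar)
    return a, b

# Hypothetical illustrative data (not from the slides)
x_toy = [1, 2, 3, 4, 5]
y_toy = [2, 4, 5, 4, 5]

a, b = least_squares_line(x_toy, y_toy)
residuals = [yi - (a + b * xi) for xi, yi in zip(x_toy, y_toy)]

print(f"y-hat = {a:.2f} + {b:.2f} x")
print("residuals:", [round(e, 2) for e in residuals])
```

The residuals from a least-squares fit always sum to zero (up to rounding), which is a quick sanity check on the calculation.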

slide-93
SLIDE 93

Regression Line, cont.

Slope b is the key statistic produced by the regression

93

slide-94
SLIDE 94

Calculate regression line in Excel

  • Data analysis---regression

slide-95
SLIDE 95

Calculate regression line in JMP

  • Analyze---Fit Y by X

slide-96
SLIDE 96

Conditions for Inference

Inference about the regression line requires these conditions:

  • Linearity
  • Independent observations
  • Normality at each level of X
  • Equal variance at each level of X

slide-97
SLIDE 97

Conditions for Inference

This figure illustrates Normal and equal variation around the regression line at all levels of X

97

slide-98
SLIDE 98

Assessing Conditions

  • The scatterplot should be visually inspected for linearity, Normality, and equal variance
  • Plotting the residuals from the model can be helpful in this regard.
  • The table lists residuals for the illustrative data

slide-99
SLIDE 99

Assessing Conditions, cont.

  • A stemplot of the residuals shows no major departures from Normality

|-1|6
|-0|2336
| 0|01366
| 1|4
×10

  • This residual plot shows more variability at higher X values (but the data are very sparse)

slide-100
SLIDE 100

Residual Plots

With a little experience, you can get good at reading residual plots. Here’s an example of linearity with equal variance.

100

slide-101
SLIDE 101

Residual Plots

Example of linearity with unequal variance

101

slide-102
SLIDE 102

Example of Residual Plots

Example of non-linearity with equal variance

102

slide-103
SLIDE 103

103