SLIDE 31
[Figure: four panels, one each for p = 4, 5, 6 and 7 (K = 10). Horizontal axis: subset size (60 to 85). Curves are labelled by model:
p = 4: Time,x1,x2; Time,x2,x5; Time,x4,x5; Time,x4,x7; Time,x7,x8
p = 5: Time,x1,x2,x5; Time,x2,x3,x5; Time,x2,x4,x5; Time,x2,x5,x6; Time,x2,x5,x7; Time,x2,x5,x8; Time,x3,x4,x5; Time,x4,x5,x6
p = 6: Time,x2,x3,x4,x5; Time,x2,x4,x5,x8; Time,x2,x5,x6,x8; Time,x4,x5,x6,x7; Time,x4,x5,x6,x8; Time,x2,x4,x5,x6
p = 7: Time,x1,x2,x4,x5,x6; Time,x1,x2,x5,x6,x8; Time,x2,x3,x4,x5,x6; Time,x2,x4,x5,x6,x7; Time,x2,x4,x5,x6,x8; Time,x4,x5,x6,x7,x8]
Figure 44: Ozone data: forward plots of Cp(m) when p = 4, 5, 6 and 7. The last two observations to enter the subset have a clear effect on model choice: this is only the third best model of this size when m = n. The plot shows how the choice of model is influenced by the last two observations to enter the forward search.
4.12 Outlier Detection
The last two observations to enter $S_*^{(m)}$ are 56 and 65; these also seem to be outlying in the plot of residuals against trend in Figure 3.36 of Atkinson and Riani (2000). To detect outliers we calculate the deletion residual for the $n - m$ observations not in $S_*^{(m)}$. These residuals are

$$
r_i^*(m) = \frac{y_i - x_i^T \hat{\beta}^*(m)}{\sqrt{s^{*2}(m)\{1 + h_i^*(m)\}}} = \frac{e_i^*(m)}{\sqrt{s^{*2}(m)\{1 + h_i^*(m)\}}}, \tag{26}
$$

where $h_i^*(m) = x_i^T \{X^*(m)^T X^*(m)\}^{-1} x_i$ and $s^{*2}(m)$ is the estimate of $\sigma^2$ from the $m$ observations in the subset; the leverage of each observation depends on $S_*^{(m)}$. Let $i_{\min}$ denote the observation with the minimum absolute deletion residual among those not in $S_*^{(m)}$, that is

$$
i_{\min} = \arg\min_{i \notin S_*^{(m)}} |r_i^*(m)|.
$$
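The calculation in (26) can be sketched numerically. The following is a minimal illustration, not the authors' implementation: the function names `deletion_residuals` and `next_candidate` are hypothetical, the design matrix `X` is assumed to already contain any constant column, and the variance estimate is assumed to be the residual mean square on m − p degrees of freedom from the fit to the subset.

```python
import numpy as np

def deletion_residuals(X, y, subset):
    """Deletion residuals, as in (26), for the n - m observations outside `subset`.

    X is n x p (including any constant column), y has length n, and
    `subset` holds the indices of the m observations currently in the subset.
    """
    n, p = X.shape
    subset = np.asarray(subset)
    m = subset.size
    Xs, ys = X[subset], y[subset]

    # Least-squares estimate of beta from the m observations in the subset.
    beta, *_ = np.linalg.lstsq(Xs, ys, rcond=None)

    # Assumption: sigma^2 is estimated by the residual mean square on m - p d.f.
    resid_in = ys - Xs @ beta
    s2 = resid_in @ resid_in / (m - p)

    # Observations not in the subset.
    outside = np.setdiff1d(np.arange(n), subset)
    Xo = X[outside]
    e = y[outside] - Xo @ beta

    # Leverage h_i = x_i^T (Xs^T Xs)^{-1} x_i for each outside observation.
    XtX_inv = np.linalg.inv(Xs.T @ Xs)
    h = np.einsum("ij,jk,ik->i", Xo, XtX_inv, Xo)

    r = e / np.sqrt(s2 * (1.0 + h))
    return outside, r

def next_candidate(X, y, subset):
    """Index i_min with the smallest absolute deletion residual, and its value."""
    outside, r = deletion_residuals(X, y, subset)
    k = np.argmin(np.abs(r))
    return outside[k], r[k]
```

Calling `next_candidate(X, y, subset)` at each subset size m gives the observation closest to the fitted model among those not yet included. A large value of the returned residual indicates that every remaining observation is remote from the current fit.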