SLIDE 1

ST 762 Nonlinear Statistical Models for Univariate and Multivariate Response

Second-Order Effect of Estimated Weights

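Recall the general mean-variance specification

$$E(Y \mid x) = f(x, \beta), \qquad \mathrm{var}(Y \mid x) = \sigma^2 g(\beta, \theta, x)^2.$$

The large-sample approximate distribution of the GLS estimator is

$$\sqrt{n}\,\bigl(\hat\beta_{GLS} - \beta_0\bigr) \xrightarrow{L} N\bigl(0, \sigma_0^2 \Sigma_{WLS}\bigr),$$

where

$$\Sigma_{WLS} = \Bigl(\lim_{n\to\infty} n^{-1} X^T W X\Bigr)^{-1},$$

if the working variance function is the true variance function.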
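
As a rough illustration of how this first-order result is used in practice, the sketch below computes $\sigma^2 (X^T W X)^{-1}$, the finite-sample analogue of $n^{-1}\sigma_0^2\Sigma_{WLS}$. It assumes that $X$ is the $n \times p$ design (or gradient) matrix and $W = \mathrm{diag}(1/g_j^2)$; the function name `first_order_cov`, the power variance function $g = x^\theta$, and the toy data are all hypothetical choices made only for this example.

```python
import numpy as np

def first_order_cov(X, w, sigma2):
    """First-order covariance sigma2 * (X^T W X)^{-1}, with W = diag(w)."""
    XtWX = X.T @ (w[:, None] * X)      # X^T W X without forming diag(w)
    return sigma2 * np.linalg.inv(XtWX)

# Toy design (assumed for illustration): intercept plus one covariate.
rng = np.random.default_rng(0)
n = 200
X = np.column_stack([np.ones(n), rng.uniform(1.0, 5.0, n)])

theta = 1.0                            # variance parameter, treated as known here
g = X[:, 1] ** theta                   # assumed variance function g = x^theta
w = 1.0 / g**2                         # weights are 1 / g^2

cov = first_order_cov(X, w, sigma2=0.25)
se = np.sqrt(np.diag(cov))             # first-order standard errors
```

These standard errors treat the weights as known, which is exactly the approximation whose accuracy the remaining slides examine.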

SLIDE 2

The folklore theorem says that the asymptotic distribution is the same whether the weights are known ($\beta$ and $\theta$ are held equal to their true values) or estimated ($\beta$ and $\theta$ are replaced by $\sqrt{n}$-consistent estimators $\hat\beta$ and $\hat\theta$), and regardless of the number of iterations $C$ of the GLS algorithm (it is the same for all values of $C$).

Such results hold only to this order of approximation (first order). The standard errors obtained this way tend to understate the variability associated with $\hat\beta_{GLS}$, particularly for smaller $n$.

SLIDE 3

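This level of approximation suggests that

$$\mathrm{var}\bigl(\hat\beta_{GLS}\bigr) \approx \frac{1}{n}\,\sigma_0^2 \Sigma_{WLS},$$

or, equivalently,

$$\mathrm{var}\bigl(\hat\beta_{GLS}\bigr) = \frac{1}{n}\,\sigma_0^2 \Sigma_{WLS} + o_p\!\left(\frac{1}{n}\right).$$

We can sometimes go further to a more refined approximation:

$$\mathrm{var}\bigl(\hat\beta_{GLS}\bigr) = \frac{1}{n}\,\sigma_0^2 \Sigma_{WLS} + \frac{1}{n^2}\,V + o\!\left(\frac{1}{n^2}\right),$$

where the second-order term $n^{-2}V$ may capture the effect of using estimated weights and of the choice of the number of iterations.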
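
To get a feel for the relative size of the two terms, consider the special case of a scalar $\beta$, so that $\Sigma_{WLS}$ and $V$ are numbers. This simplification is an assumption made only for illustration. Factoring out the leading term of the refined approximation gives

```latex
\mathrm{var}\bigl(\hat\beta_{GLS}\bigr)
  = \frac{\sigma_0^2 \Sigma_{WLS}}{n}
    \left( 1 + \frac{1}{n}\,\frac{V}{\sigma_0^2 \Sigma_{WLS}} \right)
    + o\!\left(\frac{1}{n^2}\right).
```

The relative correction is of order $1/n$: a first-order standard error is off by a factor of roughly $\sqrt{1 + V/(n\,\sigma_0^2\Sigma_{WLS})}$, which tends to 1 as $n$ grows but can be noticeable when $n$ is small.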

SLIDE 4

Generally, arguments to establish such second-order results are very tedious, so we will pursue some simple special cases.

Throughout, we assume the variance function $g(\cdot)$ is correctly specified, as our focus is on understanding the performance of the first-order and second-order results when the model is correct.

The second-order results can provide some useful theoretical insight. However, they do not translate into improvements that can be used in practice, because the necessary calculations are much too difficult to implement easily, even in those simple special cases.

Later, we will consider the bootstrap as an alternative way of effecting the same sort of improvement “automatically” under certain circumstances.

SLIDE 5

Case I: when $g(\cdot)$ does not depend on $\beta$ [Rothenberg, 1984]

Assumptions:
- $g(\cdot)$ does not depend on $\beta$;
- the variance parameter $\theta$ is a scalar, and is estimated by $\hat\theta \overset{\cdot}{\sim} N(\theta_0, \tau^2/n)$;
- the model is linear, $E(Y_j \mid x_j) = x_j^T \beta$, and the distribution of $Y_j$ given $x_j$ is Gaussian.

SLIDE 6

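Then

$$\mathrm{var}\bigl(\hat\beta_{GLS}\bigr) = \frac{1}{n}\,\sigma_0^2 \Sigma_{WLS} + \frac{1}{n^2}\,V + o\!\left(\frac{1}{n^2}\right),$$

and $V$ is an increasing function of $\tau^2$. A heuristic argument shows that $V = \tau^2 \Sigma_D$, but $\Sigma_D$ is hard to calculate. The number of iterations $C$ appears not to matter here.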

SLIDE 7

The precision of estimation of $\beta$ by $\hat\beta_{GLS}$ is dictated by the precision of estimation of $\theta$. In particular, the more precise $\hat\theta$ is, the more precise $\hat\beta_{GLS}$ is, to second order.

The role of estimation of $\theta$ shows up only in the second-order term $n^{-2}V$; for large $n$, this term is dominated by the leading term, and its effect is negligible. For small $n$, however, the effect may be more pronounced.

The form of $V$ may be very difficult to derive, so it may not be practical to use this additional correction term to calculate improved standard errors for $\hat\beta_{GLS}$.

SLIDE 8

Case II: when $g(\cdot)$ depends on $\beta$ [Carroll, Wu, and Ruppert, 1988]

Weaker assumptions:
- $g(\cdot)$ may depend on $\beta$, i.e., a general $g(\beta, \theta, x)$;
- the variance parameter $\theta$ does not have to be a scalar, and is estimated by a “reasonable” estimator $\hat\theta$;
- the mean model need not be linear, and the errors need not be Gaussian.

Then the $C$-step estimator $\hat\beta^{(C)}_{GLS}$ satisfies

$$\mathrm{var}\bigl(\hat\beta^{(C)}_{GLS}\bigr) = \frac{1}{n}\,\sigma_0^2 \Sigma_{WLS} + \frac{1}{n^2}\,V(C) + o\!\left(\frac{1}{n^2}\right),$$

where the second-order term $V(C)$ may depend on $C$.
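
For concreteness, a $C$-step GLS estimator might be sketched as follows. This is only an illustration under strong assumptions that are not in the slides: a linear mean $f(x, \beta) = x^T\beta$ with positive fitted values, a power-of-the-mean variance function $g = |x^T\beta|^\theta$, an OLS starting value, and a crude estimator of $\theta$ (the slope from regressing $\log|\text{residual}|$ on $\log|\text{fitted value}|$). The names `wls`, `estimate_theta`, and `c_step_gls` are hypothetical.

```python
import numpy as np

def wls(X, y, w):
    """Weighted least squares: solve (X^T W X) beta = X^T W y, W = diag(w)."""
    Xw = X * w[:, None]
    return np.linalg.solve(X.T @ Xw, Xw.T @ y)

def estimate_theta(X, y, beta):
    """Crude theta estimate (assumed): slope of log|residual| on log|fitted|."""
    mu = X @ beta
    r = y - mu
    A = np.column_stack([np.ones_like(mu), np.log(np.abs(mu))])
    coef, *_ = np.linalg.lstsq(A, np.log(np.abs(r) + 1e-12), rcond=None)
    return coef[1]

def c_step_gls(X, y, C):
    """Start from OLS (C = 0), then alternate theta estimation and WLS C times."""
    beta = wls(X, y, np.ones(len(y)))          # beta^(0) = OLS
    theta = None
    for _ in range(C):
        theta = estimate_theta(X, y, beta)     # theta from the current beta
        g = np.abs(X @ beta) ** theta          # assumed power-of-mean variance
        beta = wls(X, y, 1.0 / g**2)           # reweighted fit
    return beta, theta

# Toy data (assumed): true beta = (2, 1), theta = 1, sigma = 0.1.
rng = np.random.default_rng(1)
n = 500
X = np.column_stack([np.ones(n), rng.uniform(1.0, 5.0, n)])
mu = X @ np.array([2.0, 1.0])
y = mu + 0.1 * mu * rng.standard_normal(n)

beta_hat, theta_hat = c_step_gls(X, y, C=3)
```

Changing `C` in the last line gives the different iterates whose second-order variances $V(C)$ are compared on the following slides.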

SLIDE 9

In general, $V(3) = V(4) = \cdots = V(\infty)$.

In addition, $V(2) = V(3) = \cdots = V(\infty)$ if either
- $g(\cdot)$ does not depend on $\beta$ and the errors are symmetrically distributed; or
- $\hat\beta^{(0)}_{GLS}$ is $\hat\beta_{OLS}$.

In addition, $V(1) = V(2) = \cdots = V(\infty)$ if both of the above conditions hold.

SLIDE 10

When $V(1) \neq V(3)$, there is no general ordering; both $V(1) > V(3)$ and $V(1) < V(3)$ are possible.

If $g(\cdot)$ does not depend on $\beta$ and $\mathrm{var}(\epsilon_j^2) = 2 + \kappa$ for all $j$, then for all $C$ and for some matrix $V^*$,

$$V(C) = (2 + \kappa)\,V^*.$$
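
The quantity $2 + \kappa$ can be read as follows, assuming (as is conventional, though not stated on the slide) that the errors are standardized, with $E(\epsilon_j) = 0$ and $E(\epsilon_j^2) = 1$, and that $\kappa$ denotes the excess kurtosis $E(\epsilon_j^4) - 3$:

```latex
\mathrm{var}(\epsilon_j^2)
  = E(\epsilon_j^4) - \bigl\{E(\epsilon_j^2)\bigr\}^2
  = (3 + \kappa) - 1
  = 2 + \kappa .
```

Under this reading, Gaussian errors have $\kappa = 0$ and $\mathrm{var}(\epsilon_j^2) = 2$, while heavier-tailed errors have $\kappa > 0$.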

SLIDE 11

There is no “optimal” number of iterations of the GLS algorithm in a second-order sense:
- Iterating past $C = 1$ could well be detrimental!
- The usual practice of taking $C = \infty$ could be suboptimal in some situations.
- After $C = 3$ in general, or after $C = 2$ or $C = 1$ in the special cases noted earlier, additional iteration has no effect to second order.

The form of $V(C)$ is always complicated, so bootstrap variance estimation is again preferred.

Main point: from a second-order perspective, how one estimates $\theta$ does matter! This has motivated research into determining the “best” way to estimate $\theta$ under different circumstances.

SLIDE 12

Recall: this approach is based on the first-order approximation that

$$\sqrt{n}\,\bigl(\hat\beta_{GLS} - \beta_0\bigr)$$

converges in law to some limiting Gaussian distribution.

The central limit theorem does not need to be written in this way. All we really need in practice is that

$$\hat\beta_{GLS} \overset{\cdot}{\sim} N(\beta_0, \Sigma_n),$$

where $\Sigma_n$ is some matrix we can calculate.

SLIDE 13

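Outline of bootstrap

The bootstrap depends on the assumption that

$$\epsilon_j = \frac{Y_j - f(x_j, \beta)}{g(\beta, \theta, x_j)}, \qquad 1 \le j \le n,$$

are i.i.d. (note: this is not true for Poisson regression, for example).

At step $C + 1$, get the residuals

$$s_j = \frac{Y_j - f\bigl(x_j, \hat\beta^{(C+1)}\bigr)}{g\bigl(\hat\beta^{(C+1)}, \hat\theta^{(C)}, x_j\bigr)}.$$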

SLIDE 14

For $b = 1, 2, \ldots, B$:
- Sample $s_1^b, s_2^b, \ldots, s_n^b$ with replacement from the finite population $\{s_1, s_2, \ldots, s_n\}$.
- Form the bootstrap responses
  $$Y_j^b = f\bigl(x_j, \hat\beta^{(C+1)}\bigr) + g\bigl(\hat\beta^{(C+1)}, \hat\theta^{(C)}, x_j\bigr)\, s_j^b.$$
- Get the bootstrap estimate $\hat\beta^b$ from these responses, using the $C + 1$ iterate.
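
The resampling loop above might be sketched as follows. The function `bootstrap_estimates` and its callback arguments `fit`, `f`, and `g` are hypothetical names introduced for this example; `toy_fit` is a deliberately simplified one-step fit that fixes $\theta = 1$ rather than estimating it, so it illustrates only the mechanics of the loop, not a full GLS procedure.

```python
import numpy as np

def bootstrap_estimates(x, y, fit, f, g, B, rng):
    """Residual bootstrap: resample standardized residuals, rebuild Y, refit."""
    beta_hat, theta_hat = fit(x, y)
    mu = f(x, beta_hat)                    # fitted means
    scale = g(beta_hat, theta_hat, x)      # fitted standard-deviation function
    s = (y - mu) / scale                   # standardized residuals s_j
    boot = []
    for _ in range(B):
        s_b = rng.choice(s, size=len(s), replace=True)   # draw with replacement
        y_b = mu + scale * s_b             # bootstrap responses Y_j^b
        beta_b, _ = fit(x, y_b)            # refit with the same procedure
        boot.append(beta_b)
    return np.asarray(boot)                # shape (B, p)

# Assumed toy model: f = b0 + b1 * x, g = |f|^theta.
def f(x, b):
    return b[0] + b[1] * x

def g(b, theta, x):
    return np.abs(f(x, b)) ** theta

def toy_fit(x, y):
    """Simplified fit (assumption): OLS, then one WLS step with theta fixed at 1."""
    X = np.column_stack([np.ones_like(x), x])
    b = np.linalg.lstsq(X, y, rcond=None)[0]
    w = 1.0 / g(b, 1.0, x) ** 2
    Xw = X * w[:, None]
    b = np.linalg.solve(X.T @ Xw, Xw.T @ y)
    return b, 1.0

rng = np.random.default_rng(0)
n = 100
x = rng.uniform(1.0, 5.0, n)
y = 2.0 + x + 0.1 * (2.0 + x) * rng.standard_normal(n)

boot = bootstrap_estimates(x, y, toy_fit, f, g, B=200, rng=rng)
```

Because each bootstrap sample is refit with the same procedure, any variability due to estimating the weights inside `fit` is carried into the spread of the bootstrap estimates.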

SLIDE 15

Summarize these $B$ bootstrap estimates as

$$\hat\beta_{boot} = \frac{1}{B} \sum_{b=1}^{B} \hat\beta^b, \qquad \hat\Sigma_{boot} = \frac{1}{B-1} \sum_{b=1}^{B} \bigl(\hat\beta^b - \hat\beta_{boot}\bigr)\bigl(\hat\beta^b - \hat\beta_{boot}\bigr)^T.$$

A heuristic argument shows that, when the second-order approximation works, $\hat\Sigma_{boot}$ estimates the sum of both the first-order and second-order terms in $\mathrm{var}\bigl(\hat\beta^{(C+1)}\bigr)$. That is, it does “automatic” adjustment for using estimated weights.
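
The two summaries can be computed directly from a $B \times p$ array of bootstrap estimates. The sketch below uses a tiny made-up array purely to show the arithmetic; `summarize_bootstrap` is a hypothetical helper name.

```python
import numpy as np

def summarize_bootstrap(boot):
    """Return the bootstrap mean and covariance (divisor B - 1)."""
    boot = np.asarray(boot, dtype=float)
    beta_boot = boot.mean(axis=0)                    # (1/B) * sum of beta^b
    dev = boot - beta_boot                           # deviations from the mean
    sigma_boot = dev.T @ dev / (boot.shape[0] - 1)   # sum of outer products / (B - 1)
    return beta_boot, sigma_boot

# Made-up estimates (B = 3, p = 2), for illustration only.
toy_boot = np.array([[1.0, 2.0],
                     [3.0, 4.0],
                     [5.0, 9.0]])

beta_boot, sigma_boot = summarize_bootstrap(toy_boot)
boot_se = np.sqrt(np.diag(sigma_boot))               # bootstrap standard errors
```

The result agrees with `np.cov(toy_boot, rowvar=False)`, which uses the same $B - 1$ divisor.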
