SLIDE 1

ST 762 Nonlinear Statistical Models for Univariate and Multivariate Response

Generalized Linear Model

Certain nonlinear models with a specific structure arise from using linear modeling with a parent distribution in the exponential family. If the linear part is replaced by a more general nonlinear specification, the result is a special case of our general mean-variance specification

\[ E(Y \mid x) = f(x, \beta), \qquad \operatorname{var}(Y \mid x) = \sigma^2 g(\beta, \theta, x)^2. \]

Estimation may also be carried out using the GLS estimation equations.

SLIDE 2

The (Scaled) Exponential Family

$Y$ has a scaled exponential family distribution if its density (or probability mass function) is of the form

\[ f(y; \xi, \sigma) = \exp\left\{ \frac{y\xi - b(\xi)}{\sigma^2} + c(y, \sigma) \right\}. \]

$\xi$ is the canonical parameter, and $\sigma$ is the scale parameter. If $\sigma^2$ is known, this is the usual one-parameter exponential family with canonical parameter $\xi$. If $\sigma^2$ is unknown, it may or may not be the usual two-parameter exponential family.
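
As a concrete check of this form, the sketch below writes the Poisson probability mass function as a scaled exponential family with ξ = log λ, b(ξ) = exp(ξ), σ = 1 and c(y, σ) = −log y!. The function names and the value λ = 2.5 are illustrative assumptions, not part of the course material.

```python
import math

def scaled_expfam_density(y, xi, sigma, b, c):
    """Evaluate exp{ (y*xi - b(xi)) / sigma^2 + c(y, sigma) }."""
    return math.exp((y * xi - b(xi)) / sigma ** 2 + c(y, sigma))

def poisson_b(xi):
    # b(xi) = exp(xi) for the Poisson distribution
    return math.exp(xi)

def poisson_c(y, sigma):
    # c(y, sigma) = -log(y!); sigma is fixed at 1 and does not enter
    return -math.lgamma(y + 1)

def poisson_pmf(y, lam):
    # Textbook form of the Poisson pmf, used only for comparison
    return math.exp(-lam) * lam ** y / math.factorial(y)

lam = 2.5  # illustrative rate
for y in range(5):
    family_form = scaled_expfam_density(y, math.log(lam), 1.0, poisson_b, poisson_c)
    print(y, family_form, poisson_pmf(y, lam))
```

The two printed columns should agree up to floating-point rounding, since both are the same function written two ways.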

SLIDE 3

Moments:

\[ E(Y) = b_\xi(\xi) = \frac{db(\xi)}{d\xi}, \qquad \operatorname{var}(Y) = \sigma^2 b_{\xi\xi}(\xi) = \sigma^2 \frac{d^2 b(\xi)}{d\xi^2}. \]

If $E(Y) = \mu = b_\xi(\xi)$, then $\xi = b_\xi^{-1}(\mu)$.

The function $b_\xi^{-1}(\cdot)$ is called the canonical link function, because it links the canonical parameter $\xi$ to the mean $\mu$.
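
A worked instance may help here. For a Bernoulli response (σ² = 1), taking b(ξ) = log(1 + e^ξ) and differentiating gives the mean, the variance, and the canonical link in turn:

```latex
\begin{aligned}
b_\xi(\xi) &= \frac{e^{\xi}}{1 + e^{\xi}} = \mu, \\
b_{\xi\xi}(\xi) &= \frac{e^{\xi}}{(1 + e^{\xi})^{2}} = \mu(1 - \mu), \\
\xi = b_\xi^{-1}(\mu) &= \log\frac{\mu}{1 - \mu}.
\end{aligned}
```

So for this distribution the canonical link is the logit, and the variance is μ(1 − μ).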

SLIDE 4

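Also

\[ \operatorname{var}(Y) = \sigma^2 b_{\xi\xi}\left\{ b_\xi^{-1}(\mu) \right\} = \sigma^2 g(\mu)^2, \]

so the variance depends on the mean in a specific way.

Examples of the scaled exponential family:

Distribution              $b(\xi)$             $\xi(\mu)$                 $g(\mu)^2$
Normal, $\sigma^2 = 1$    $\xi^2/2$            $\mu$                      $1$
Poisson                   $\exp(\xi)$          $\log \mu$                 $\mu$
Gamma                     $-\log(-\xi)$        $-1/\mu$                   $\mu^2$
Inverse Gaussian          $-\sqrt{-2\xi}$      $-1/(2\mu^2)$              $\mu^3$
Binomial                  $\log(1 + e^\xi)$    $\log\{\mu/(1 - \mu)\}$    $\mu(1 - \mu)$

For the gamma and inverse Gaussian rows, solving $b_\xi(\xi) = \mu$ gives $\xi = -1/\mu$ and $\xi = -1/(2\mu^2)$; the corresponding links are conventionally quoted as $1/\mu$ and $1/\mu^2$, dropping the sign and the constant.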
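
The variance functions in the table can be checked numerically. The sketch below differentiates b(ξ) twice by central differences at ξ = ξ(μ) and compares the result with g(μ)². For the gamma and inverse Gaussian rows it uses the exact inverses of b_ξ, namely ξ = −1/μ and ξ = −1/(2μ²), so that ξ stays in the region where b is defined. The dictionary layout, the step size and the test value μ = 0.4 are assumptions made for illustration.

```python
import math

def second_derivative(b, xi, h=1e-4):
    """Central-difference approximation to b''(xi)."""
    return (b(xi + h) - 2.0 * b(xi) + b(xi - h)) / (h * h)

# name -> (b(xi), xi(mu), g(mu)^2)
FAMILIES = {
    "normal": (lambda xi: xi * xi / 2.0,
               lambda mu: mu,
               lambda mu: 1.0),
    "poisson": (math.exp,
                math.log,
                lambda mu: mu),
    "gamma": (lambda xi: -math.log(-xi),
              lambda mu: -1.0 / mu,
              lambda mu: mu ** 2),
    "inverse_gaussian": (lambda xi: -math.sqrt(-2.0 * xi),
                         lambda mu: -1.0 / (2.0 * mu ** 2),
                         lambda mu: mu ** 3),
    "binomial": (lambda xi: math.log(1.0 + math.exp(xi)),
                 lambda mu: math.log(mu / (1.0 - mu)),
                 lambda mu: mu * (1.0 - mu)),
}

def variance_function_numeric(name, mu):
    """Approximate b''(xi(mu)), which should equal g(mu)^2."""
    b, xi_of_mu, _ = FAMILIES[name]
    return second_derivative(b, xi_of_mu(mu))

mu = 0.4  # illustrative mean, valid for every family above
for name, (_, _, g_squared) in FAMILIES.items():
    print(name, variance_function_numeric(name, mu), g_squared(mu))
```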

SLIDE 5

Sufficiency

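If $Y_1, Y_2, \dots, Y_n$ is a random sample from a member of this family, the log-likelihood is

\[ \log L = \sum_{j=1}^{n} \left\{ \frac{Y_j \xi - b(\xi)}{\sigma^2} + c(Y_j, \sigma) \right\} = \frac{1}{\sigma^2} \left\{ \xi \sum_{j=1}^{n} Y_j - n b(\xi) \right\} + \sum_{j=1}^{n} c(Y_j, \sigma), \]

so (if $\sigma^2$ is known) $\sum_{j=1}^{n} Y_j$ is sufficient for $\xi$.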
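
One way to see sufficiency at work: two samples of the same size with the same total have log-likelihoods that differ only by a constant that does not involve ξ. The sketch below checks this for the Poisson case (b(ξ) = exp(ξ), σ² = 1). Both samples are made up for illustration.

```python
import math

def poisson_loglik(ys, xi):
    """sum_j { y_j*xi - exp(xi) - log(y_j!) }, the Poisson case with sigma^2 = 1."""
    return sum(y * xi - math.exp(xi) - math.lgamma(y + 1) for y in ys)

# Two made-up samples with n = 4 and the same total, 12
sample_a = [0, 2, 4, 6]
sample_b = [3, 3, 3, 3]

# The difference in log-likelihood at several values of xi
diffs = [poisson_loglik(sample_a, xi) - poisson_loglik(sample_b, xi)
         for xi in (-1.0, 0.0, 0.5, 2.0)]
print(diffs)
```

Every entry of `diffs` should be the same number up to rounding: it is the difference of the c(Y_j, σ) terms, so the two samples carry the same information about ξ.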

SLIDE 6

Also, if $Y_1, Y_2, \dots, Y_n$ are independent, but in the distribution of $Y_j$, $\xi$ is replaced by $\xi_j = x_j^T \beta$, the log-likelihood is

\[ \log L = \frac{1}{\sigma^2} \left\{ \sum_{j=1}^{n} Y_j x_j^T \beta - \sum_{j=1}^{n} b(\xi_j) \right\} + \sum_{j=1}^{n} c(Y_j, \sigma), \]

so now $\sum_{j=1}^{n} Y_j x_j$ is sufficient for $\beta$.

But note that

\[ E(Y_j \mid x_j) = \mu_j = b_\xi(\xi_j) = b_\xi\left( x_j^T \beta \right), \]

so this is a conventional linear model only if $b_\xi(\xi) = \xi$, i.e., for the normal distribution. Otherwise, it is a generalized linear model.
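
Differentiating this log-likelihood with respect to β gives the score σ⁻² Σ x_j {Y_j − b_ξ(x_jᵀβ)}, so at the maximum the fitted means reproduce the sufficient statistic: Σ x_j μ̂_j = Σ x_j Y_j. The sketch below illustrates this for Poisson regression with an intercept and one slope, fitted by Newton-Raphson in plain Python. The data, the starting values and the fixed iteration count are assumptions chosen for illustration, not a recommended general-purpose fitter.

```python
import math

def fit_poisson_log_link(xs, ys, iters=25):
    """Newton-Raphson for E(Y_j | x_j) = exp(b0 + b1 * x_j), the canonical Poisson link."""
    b0, b1 = math.log(sum(ys) / len(ys)), 0.0  # start at the intercept-only fit
    for _ in range(iters):
        mu = [math.exp(b0 + b1 * x) for x in xs]
        # Score: sum_j x_j (y_j - mu_j), with x_j = (1, x)
        u0 = sum(y - m for y, m in zip(ys, mu))
        u1 = sum(x * (y - m) for x, y, m in zip(xs, ys, mu))
        # Information: sum_j mu_j x_j x_j^T (a 2 x 2 matrix)
        i00 = sum(mu)
        i01 = sum(x * m for x, m in zip(xs, mu))
        i11 = sum(x * x * m for x, m in zip(xs, mu))
        det = i00 * i11 - i01 * i01
        # Newton step: beta <- beta + I^{-1} U
        b0 += (i11 * u0 - i01 * u1) / det
        b1 += (i00 * u1 - i01 * u0) / det
    return b0, b1

# Made-up illustrative data
xs = [0.0, 1.0, 2.0, 3.0, 4.0]
ys = [1, 3, 4, 9, 14]

b0, b1 = fit_poisson_log_link(xs, ys)
fitted = [math.exp(b0 + b1 * x) for x in xs]

# Both components of sum_j x_j Y_j are matched by the fitted means
print(sum(ys), sum(fitted))
print(sum(x * y for x, y in zip(xs, ys)), sum(x * m for x, m in zip(xs, fitted)))
```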

SLIDE 7

Note that $b_\xi(\cdot)$ is determined by the distribution. We can replace it by a different function,

\[ E(Y_j \mid x_j) = f\left( x_j^T \beta \right), \]

and it is still called a generalized linear model. Because the link $f^{-1}(\cdot)$ is no longer the canonical link, we lose sufficiency, which is not a big deal. R and SAS support fitting these models with the link function chosen from a list.
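
To see where sufficiency is lost, it can help to write out the score for a general inverse link f. A sketch of the derivation, writing η_j = x_jᵀβ and f′ for the derivative of f (notation introduced here for illustration), and using ∂ξ_j/∂μ_j = 1/b_ξξ(ξ_j) = 1/g(μ_j)²:

```latex
\begin{aligned}
\frac{\partial \log L}{\partial \beta}
  &= \sum_{j=1}^{n}
     \frac{\partial \log L}{\partial \xi_j}\,
     \frac{\partial \xi_j}{\partial \mu_j}\,
     \frac{\partial \mu_j}{\partial \beta}
   = \frac{1}{\sigma^{2}} \sum_{j=1}^{n}
     \frac{f'(\eta_j)}{g(\mu_j)^{2}}\,(Y_j - \mu_j)\, x_j,
  \qquad \mu_j = f(\eta_j).
\end{aligned}
```

With the canonical choice f = b_ξ, the weight f′(η_j)/g(μ_j)² equals 1 and the score involves the data only through Σ Y_j x_j. For any other f the weight depends on β, so no such reduction is available, although the estimating equation itself is still straightforward to solve.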

SLIDE 8

Example: Six Cities Wheezing data

Response: child wheezes at age 9 (0 or 1).
Predictor: mother’s smoking status (0 = none, 1 = moderate, 2 = heavy).
Possible covariate: community (Portage or Kingston).

SLIDE 9

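Model: $Y_j \sim \text{Bernoulli}(\mu_j)$.

Canonical link:

\[ \log\left( \frac{\mu_j}{1 - \mu_j} \right) = x_j^T \beta \]

or

\[ E(Y_j \mid x_j) = \mu_j = \frac{\exp\left( x_j^T \beta \right)}{1 + \exp\left( x_j^T \beta \right)}. \]

This is logistic regression.

Alternative link: probit function, $\mu_j = \Phi\left( x_j^T \beta \right)$, where $\Phi$ is the standard normal distribution function.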
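
The two mean functions above are easy to compare directly. The sketch below codes the inverse logit and the inverse probit (the standard normal distribution function, via `math.erf`) and evaluates both at the three smoking levels. The coefficient values are hypothetical and are not estimates from the Six Cities data.

```python
import math

def inv_logit(eta):
    """Inverse of the canonical link: exp(eta) / (1 + exp(eta))."""
    return 1.0 / (1.0 + math.exp(-eta))

def logit(mu):
    """Canonical link for the Bernoulli mean: log(mu / (1 - mu))."""
    return math.log(mu / (1.0 - mu))

def inv_probit(eta):
    """Standard normal distribution function Phi(eta)."""
    return 0.5 * (1.0 + math.erf(eta / math.sqrt(2.0)))

# Hypothetical coefficients: intercept and slope on smoking status
beta = (-1.5, 0.3)

for smoke in (0, 1, 2):
    eta = beta[0] + beta[1] * smoke
    print(smoke, round(inv_logit(eta), 4), round(inv_probit(eta), 4))
```

The same β gives different probabilities under the two links, so coefficients fitted under one link are not directly comparable with those fitted under the other.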

SLIDE 10

Generalized Nonlinear Model

We may want a more general specification for the conditional mean:

\[ E(Y_j \mid x_j) = f(x_j, \beta). \]

This is consistent with the scaled exponential family if $\xi_j$ satisfies

\[ b_\xi(\xi_j) = f(x_j, \beta). \]

The mean-variance relationship is still determined by the distribution:

\[ \operatorname{var}(Y_j \mid x_j) = \sigma^2 g\{E(Y_j \mid x_j)\}^2 = \sigma^2 g\{f(x_j, \beta)\}^2. \]
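
As a small illustration, suppose the response has the Poisson mean-variance relationship, g(μ)² = μ, but the mean follows a saturating curve f(x, β) = β₀x/(β₁ + x). This mean function and the parameter values are hypothetical choices made here; the point is only that once the mean is specified, the variance and the canonical parameter follow from the distribution.

```python
import math

def mean_saturating(x, beta):
    """Hypothetical nonlinear mean f(x, beta) = beta0 * x / (beta1 + x)."""
    beta0, beta1 = beta
    return beta0 * x / (beta1 + x)

def poisson_variance(mu, sigma2=1.0):
    """var(Y | x) = sigma^2 * g(mu)^2 with g(mu)^2 = mu."""
    return sigma2 * mu

def poisson_canonical_parameter(mu):
    """Solve b_xi(xi) = exp(xi) = mu for xi."""
    return math.log(mu)

beta = (10.0, 3.0)  # illustrative parameter values
for x in (1.0, 2.0, 5.0):
    mu = mean_saturating(x, beta)
    print(x, mu, poisson_variance(mu), poisson_canonical_parameter(mu))
```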
