
Workshop 10.4: Generalized linear models

Murray Logan

August 16, 2016

Table of contents

1 Exponential family distributions

0.1. Linear models

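$$y_i = \beta_0 + \beta_1 x_i + \varepsilon_i, \qquad \varepsilon_i \sim N(0, \sigma^2)$$

$$V = \mathrm{cov} = \begin{pmatrix} \sigma^2 & 0 & \cdots & 0 \\ 0 & \sigma^2 & \cdots & 0 \\ \vdots & \vdots & \ddots & \vdots \\ 0 & 0 & \cdots & \sigma^2 \end{pmatrix}$$

  • Linearity
  • Normality
  • Homogeneity of variance
  • Zero covariance (= independence)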
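As a minimal sketch of the linear model in this section, the following Python simulates responses with independent Gaussian errors and builds the matching covariance matrix V, with σ² on the diagonal and zeros elsewhere. The function names, predictor values and the values of β0, β1 and σ are assumptions chosen for illustration; none come from the workshop.

```python
import random

def simulate_linear(x, beta0, beta1, sigma, seed=1):
    """Draw y_i = beta0 + beta1 * x_i + e_i with e_i ~ N(0, sigma^2)."""
    rng = random.Random(seed)
    return [beta0 + beta1 * xi + rng.gauss(0.0, sigma) for xi in x]

def covariance_matrix(n, sigma):
    """sigma^2 on the diagonal (homogeneity), 0 elsewhere (independence)."""
    return [[sigma ** 2 if i == j else 0.0 for j in range(n)] for i in range(n)]

# Hypothetical predictor values and parameters, chosen only for illustration.
x = [1, 2, 3, 4, 5]
y = simulate_linear(x, beta0=2.0, beta1=0.5, sigma=1.0)
V = covariance_matrix(len(x), sigma=1.0)
```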

0.2. Other data types

  • Binary - only 0 and 1 (dead/alive) (present/absent)
  • Proportional abundance - range from 0 to 100
  • Count data - min of zero

0.3. Linear models

[Figure: panel a) presence/absence response (Absent, Present) plotted against X, with the predicted probability of presence on a 0.0 to 1.0 axis; panel b) frequency plot.]

  • expected values outside logical bounds
  • response not normally distributed
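The first problem listed above can be illustrated with a small sketch: a straight line fitted by ordinary least squares to a 0/1 response produces predictions below 0 and above 1. The data and the helper `fit_line` are hypothetical and are not taken from the workshop.

```python
def fit_line(x, y):
    """Ordinary least squares intercept and slope for one predictor."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return mean_y - slope * mean_x, slope

# Hypothetical presence (1) / absence (0) data, not taken from the workshop.
x = list(range(1, 13))
y = [0, 0, 0, 0, 0, 0, 1, 1, 1, 1, 1, 1]

intercept, slope = fit_line(x, y)
predicted = [intercept + slope * xi for xi in x]

# Predictions that cannot be interpreted as probabilities.
out_of_bounds = [p for p in predicted if p < 0 or p > 1]
```

With these made-up data the predictions at the smallest and largest values of X fall outside the [0, 1] range that a probability of presence must respect.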

0.4. Logistic models

[Figure: presence/absence response (Absent, Present) plotted against X on a 0.0 to 1.0 scale, with a frequency panel (b).]

  • expected values outside logical bounds
  • response not normally distributed

1. Exponential family distributions

1.1. Gaussian distribution

Virtually unbounded measurements (weights, lengths, etc.)

[Figure: probability density function and cumulative density function for three parameter sets (µ = 25, σ² = 5; µ = 25, σ² = 2; µ = 10, σ² = 2); x axis from 5 to 40.]

$$f(x \mid \mu, \sigma^2) = \frac{1}{\sqrt{2\sigma^2\pi}} \, e^{-\frac{(x-\mu)^2}{2\sigma^2}}$$
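The Gaussian density can be evaluated directly, as in the following minimal sketch. The function name `gaussian_pdf` is an assumption, and the second argument is the variance σ², not the standard deviation.

```python
import math

def gaussian_pdf(x, mu, sigma2):
    """Density of N(mu, sigma2) at x; sigma2 is the variance."""
    return math.exp(-((x - mu) ** 2) / (2 * sigma2)) / math.sqrt(2 * math.pi * sigma2)

# The three (mu, sigma2) pairs plotted in this section.
for mu, sigma2 in [(25, 5), (25, 2), (10, 2)]:
    print(mu, sigma2, gaussian_pdf(mu, mu, sigma2))  # density at the mean
```

A smaller variance gives a taller, narrower curve, which is why the density at the mean is higher for σ² = 2 than for σ² = 5.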

1.2. Binomial distribution

Presence/absence and data bound to the range [0,1]

[Figure: probability density function and cumulative density function for n = 50, n = 20 and n = 3; x axis from 5 to 40.]

$$f(k \mid n, p) = \binom{n}{k} p^k (1-p)^{n-k}$$
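As a sketch, the binomial probability of k successes in n trials can be computed with the binomial coefficient "n choose k". The function name and the value p = 0.5 are assumptions for illustration; n = 20 is one of the values plotted in this section.

```python
import math

def binomial_pmf(k, n, p):
    """Probability of k successes in n independent trials with success probability p."""
    return math.comb(n, k) * p ** k * (1 - p) ** (n - k)

n, p = 20, 0.5  # p is an assumed value
probabilities = [binomial_pmf(k, n, p) for k in range(n + 1)]
```

The probabilities over k = 0, ..., n sum to one, which is a quick check that the coefficient and both exponents are in the right places.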


1.3. Poisson distribution

Count data (or count derivatives - like low densities)

[Figure: probability density function and cumulative density function for λ = 25, λ = 15 and λ = 3; x axis from 5 to 40.]

$$f(x \mid \lambda) = \frac{e^{-\lambda} \lambda^x}{x!}$$
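A minimal sketch of the Poisson probability function follows. It works on the log scale (using `math.lgamma(x + 1)` for log x!) so that large counts do not overflow; the function name is an assumption.

```python
import math

def poisson_pmf(x, lam):
    """Probability of observing count x when the mean (and variance) is lam."""
    return math.exp(-lam + x * math.log(lam) - math.lgamma(x + 1))

# The three values of lambda plotted in this section.
for lam in (25, 15, 3):
    mean = sum(x * poisson_pmf(x, lam) for x in range(200))
    print(lam, mean)  # the mean of the distribution is close to lambda
```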

1.4. Negative Binomial

Count data (or count derivatives - like low densities)

[Figure: probability density function and cumulative density function for n = 25, n = 10 and n = 1.5; x axis from 5 to 40.]

$$f(x \mid \mu, \omega) = \frac{\Gamma(x+\omega)}{\Gamma(\omega)\, x!} \times \frac{\mu^x \omega^\omega}{(\mu+\omega)^{x+\omega}}$$
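The following sketch evaluates the negative binomial probability function in its mean (µ) and dispersion (ω) parameterisation, with (µ + ω) raised to the power x + ω so that the probabilities sum to one. The function name and the values µ = 10 and ω = 1.5 are assumptions for illustration.

```python
import math

def neg_binomial_pmf(x, mu, omega):
    """Negative binomial probability of count x with mean mu and dispersion omega."""
    log_p = (
        math.lgamma(x + omega) - math.lgamma(omega) - math.lgamma(x + 1)
        + x * math.log(mu) + omega * math.log(omega)
        - (x + omega) * math.log(mu + omega)
    )
    return math.exp(log_p)

mu, omega = 10.0, 1.5  # assumed values
support = range(2000)
total = sum(neg_binomial_pmf(x, mu, omega) for x in support)
mean = sum(x * neg_binomial_pmf(x, mu, omega) for x in support)
variance = sum((x - mean) ** 2 * neg_binomial_pmf(x, mu, omega) for x in support)
```

In this parameterisation the variance is µ + µ²/ω, so it exceeds the mean. That is what makes the distribution useful for counts that are more variable than a Poisson allows.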


1.5. General linear models

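$$y_i = \beta_0 + \beta_1 x_i + \varepsilon_i, \qquad \varepsilon_i \sim N(0, \sigma^2)$$

$$V = \mathrm{cov} = \begin{pmatrix} \sigma^2 & 0 & \cdots & 0 \\ 0 & \sigma^2 & \cdots & 0 \\ \vdots & \vdots & \ddots & \vdots \\ 0 & 0 & \cdots & \sigma^2 \end{pmatrix}$$

  • Linearity
  • Normality
  • Homogeneity of variance
  • Zero covariance (= independence)

$$\underbrace{E(Y)}_{\text{Link function}} = \underbrace{\beta_0 + \beta_1 x_1 + \dots + \beta_p x_p}_{\text{Systematic}} + \varepsilon, \qquad \varepsilon \sim \mathrm{Dist}(\dots)$$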

1.6. General linear models

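$$\underbrace{E(Y)}_{\text{Link function}} = \underbrace{\beta_0 + \beta_1 x_1 + \dots + \beta_p x_p}_{\text{Systematic}} + \underbrace{e}_{\text{Random}}$$

  • Random component: a nominated distribution (Gaussian, Poisson, Binomial, Gamma, Beta, ...), for example $Y_i \sim N(\mu_i, \sigma^2)$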

1.7. General linear models

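$$\underbrace{E(Y)}_{\text{Link function}} = \underbrace{\beta_0 + \beta_1 x_1 + \dots + \beta_p x_p}_{\text{Systematic}} + \underbrace{e}_{\text{Random}}$$

  • Random component
  • Systematic component: $\beta_0 + \beta_1 x_1 + \dots + \beta_p x_p$
  • Link function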
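The three components can be written as three small functions. The sketch below uses a binary response, so the random component is a Bernoulli draw (binomial with n = 1) and the link is the logit. All names, coefficients and predictor values are hypothetical.

```python
import math
import random

def linear_predictor(x, beta0, beta1):
    """Systematic component: beta0 + beta1 * x."""
    return beta0 + beta1 * x

def inverse_logit(eta):
    """Inverse of the logit link: maps the linear predictor to E(Y) in (0, 1)."""
    return 1.0 / (1.0 + math.exp(-eta))

def draw_response(p, rng):
    """Random component: one Bernoulli (binomial, n = 1) draw with mean p."""
    return 1 if rng.random() < p else 0

# Hypothetical coefficients and predictor values, for illustration only.
rng = random.Random(42)
x = [2, 4, 6, 8, 10, 12]
expected = [inverse_logit(linear_predictor(xi, beta0=-4.0, beta1=0.6)) for xi in x]
y = [draw_response(p, rng) for p in expected]
```

Swapping the link or the distribution changes only one function, which is the sense in which the same linear predictor serves many kinds of response.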

1.8. Generalized linear models

| Response variable | Probability distribution | Link function | Model name |
|---|---|---|---|
| Continuous measurements | Gaussian | identity: $\mu$ | Linear regression |
| Binary, proportions | Binomial | logit: $\log\left(\frac{\pi}{1-\pi}\right)$ | Logistic regression |
| | | probit: $\frac{1}{\sqrt{2\pi}} \int_{-\infty}^{\alpha + \beta.X} \exp\left(-\frac{1}{2} Z^2\right) dZ$ | Probit regression |
| | | complementary log-log: $\log(-\log(1-\pi))$ | Logistic regression |
| | Quasi-binomial | logit: $\log\left(\frac{\pi}{1-\pi}\right)$ | Logistic regression |
| Counts | Poisson | log: $\log \mu$ | Poisson regression / log-linear model |
| | Negative binomial | $\log\left(\frac{\mu}{\mu-\theta}\right)$ | Negative binomial regression |
| | Quasi-poisson | log: $\log \mu$ | Poisson regression |
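As a sketch, several of the link functions in the table can be paired with their inverses, which map the linear predictor back to the scale of the expected value. The dictionary `LINKS` and its keys are names assumed here. The probit pair uses the standard normal quantile function and cumulative distribution function from Python's `statistics` module, and the negative binomial link is left out.

```python
import math
from statistics import NormalDist

_std_normal = NormalDist()  # standard normal, used for the probit link

# name: (link applied to the expected value, inverse applied to the linear predictor)
LINKS = {
    "identity": (lambda mu: mu, lambda eta: eta),
    "logit": (lambda pi: math.log(pi / (1 - pi)),
              lambda eta: 1 / (1 + math.exp(-eta))),
    "probit": (_std_normal.inv_cdf, _std_normal.cdf),
    "cloglog": (lambda pi: math.log(-math.log(1 - pi)),
                lambda eta: 1 - math.exp(-math.exp(eta))),
    "log": (math.log, math.exp),
}

# Round trip: applying a link and then its inverse recovers the starting value.
for name, (link, inverse) in LINKS.items():
    eta = link(0.3)
    print(name, eta, inverse(eta))
```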

1.9. OLS

[Figure: sum of squares plotted against candidate parameter estimates (6 to 14), with µ = 10 marked.]
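The idea behind ordinary least squares can be sketched by computing the sum of squares for a grid of candidate values of µ and keeping the one that minimises it. The sample below is hypothetical, chosen so that its mean is 10.

```python
def sum_of_squares(data, mu):
    """Sum of squared deviations of the observations from a candidate mu."""
    return sum((xi - mu) ** 2 for xi in data)

data = [8, 9, 10, 11, 12]  # hypothetical sample with mean 10
candidates = [i / 10 for i in range(60, 141)]  # 6.0, 6.1, ..., 14.0
ss = [sum_of_squares(data, mu) for mu in candidates]

# The candidate with the smallest sum of squares is the OLS estimate on this grid.
best = candidates[ss.index(min(ss))]
```

On this grid the minimising value coincides with the sample mean, which is the least squares estimate of µ.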

1.10. Maximum Likelihood

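$$f(x \mid \mu, \sigma^2) = \frac{1}{\sqrt{2\sigma^2\pi}} \, e^{-\frac{(x-\mu)^2}{2\sigma^2}}$$

$$\ln L(\mu, \sigma^2) = -\frac{n}{2}\ln(2\pi) - \frac{n}{2}\ln\sigma^2 - \frac{1}{2\sigma^2}\sum_{i=1}^{n}(x_i - \mu)^2$$

Maximum likelihood estimates:

$$\hat{\mu} = \bar{x} = \frac{1}{n}\sum_{i=1}^{n} x_i$$

$$\hat{\sigma}^2 = \frac{1}{n}\sum_{i=1}^{n}(x_i - \bar{x})^2$$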
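A minimal sketch of these results: the log-likelihood is coded term by term, the closed-form estimates are computed, and nearby parameter values are shown to give a lower log-likelihood. The sample and the function names are hypothetical.

```python
import math

def log_likelihood(data, mu, sigma2):
    """Gaussian log-likelihood of the sample for a given mean and variance."""
    n = len(data)
    ss = sum((xi - mu) ** 2 for xi in data)
    return -n / 2 * math.log(2 * math.pi) - n / 2 * math.log(sigma2) - ss / (2 * sigma2)

def gaussian_mle(data):
    """Closed-form maximum likelihood estimates of the mean and variance."""
    n = len(data)
    mu_hat = sum(data) / n
    sigma2_hat = sum((xi - mu_hat) ** 2 for xi in data) / n  # divides by n, not n - 1
    return mu_hat, sigma2_hat

data = [8, 9, 10, 11, 12]  # hypothetical sample
mu_hat, sigma2_hat = gaussian_mle(data)
best = log_likelihood(data, mu_hat, sigma2_hat)

# Moving either parameter away from its estimate lowers the log-likelihood.
for mu, sigma2 in [(10.5, 2.0), (10.0, 2.5), (10.0, 1.5)]:
    print(mu, sigma2, log_likelihood(data, mu, sigma2) < best)
```

Note that the variance estimate divides by n, so it is smaller than the usual sample variance, which divides by n − 1.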

1.11. Maximum Likelihood

[Figure: log-likelihood plotted against candidate parameter estimates (6 to 14), with µ = 10 marked.]