Workshop 4: Statistical modelling intro

Murray Logan

March 10, 2019

Table of contents

1. Introduction

1. Introduction

1.1. Statistical modelling

What is a statistical model?

1.2. Statistical modelling

What is a statistical model?

Mathematical model

[Figure: y plotted against x with the deterministic line y = 2 + 1.5x]

Statistical model

[Figure: y plotted against x with scatter around the line y = 2 + 1.5x + ε]


1.3. Statistical modelling

[Figure: y plotted against x with scatter around the line y = 2 + 1.5x + ε]

What is a statistical model?

  • stochastic mathematical expression
  • low-dimensional summary
  • relates one or more dependent random variables to one or more independent variables

1.4. Statistical modelling

A random variable is one whose values depend on a set of random events and are described by a probability distribution

1.5. Statistical modelling

[Figure: y plotted against x with scatter around the line y = 2 + 1.5x + ε]

What is a statistical model?

  • embodies a data generation process along with the distributional assumptions underlying this generation
  • incorporates uncertainty (error): response = model + error

1.6. Statistical modelling

[Figure: y plotted against x with scatter around the line y = 2 + 1.5x + ε]

What is the purpose of statistical modelling?

  • describe relationships / effects
  • estimate effects
  • predict outcomes

1.7. Statistical models

[Figure: y plotted against x with scatter around the line y = 2 + 1.5x + ε]

How do we estimate model parameters?

Y ∼ β0 + β1X

What criterion do we use to assess best fit?

  • Depends on how we assume Y is distributed

1.8. Statistical models

[Figure: y plotted against x with scatter around the line y = 2 + 1.5x + ε]

If we assume Y is drawn from a normal (Gaussian) distribution...

  • Ordinary Least Squares (OLS)
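As a rough illustration (Python, not part of the workshop materials), the sketch below simulates data from the slide's line y = 2 + 1.5x with Gaussian noise and recovers the parameters by ordinary least squares; the seed, sample size, and noise standard deviation are all illustrative assumptions.

```python
# Illustrative sketch: simulate y = 2 + 1.5x + eps with Gaussian errors and fit by OLS.
import numpy as np

rng = np.random.default_rng(1)                     # seed chosen arbitrarily
x = np.arange(1, 7, dtype=float)                   # x = 1..6, matching the slide's axis
y = 2 + 1.5 * x + rng.normal(0, 1, x.size)         # assumed error sd of 1

# Design matrix with an intercept column; OLS minimises ||y - X b||^2
X = np.column_stack([np.ones_like(x), x])
beta_hat, *_ = np.linalg.lstsq(X, y, rcond=None)
print(beta_hat)                                    # close to the true (2, 1.5), up to sampling noise
```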

1.9. Estimation

  • parameters
    – location (mean)
    – spread (variance)
  • uncertainty


1.10. Estimation

1.10.1. Least squares

[Figure: sum of squares plotted against candidate parameter estimates (6 to 14), minimised at µ = 10]

1.11. Estimation

1.11.1. Least squares estimates

  • Minimize sum of the squared residuals
  • Solve simultaneous equations

Y      X
3.0    0
2.5    1
6.0    2
5.5    3
9.0    4
8.6    5
12.0   6

3.0 = β0 × 1 + β1 × 0 + ε1
2.5 = β0 × 1 + β1 × 1 + ε2
6.0 = β0 × 1 + β1 × 2 + ε3
5.5 = β0 × 1 + β1 × 3 + ε4
9.0 = β0 × 1 + β1 × 4 + ε5
8.6 = β0 × 1 + β1 × 5 + ε6
12.0 = β0 × 1 + β1 × 6 + ε7
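A minimal sketch (Python/NumPy, not the workshop's own code) that solves the normal equations (XᵀX)b = Xᵀy for the small data set tabulated above, which is one way of minimising the sum of squared residuals; the data values are as reconstructed in the table.

```python
# Sketch: least-squares estimates for the tabulated data via the normal equations.
import numpy as np

x = np.array([0, 1, 2, 3, 4, 5, 6], dtype=float)
y = np.array([3.0, 2.5, 6.0, 5.5, 9.0, 8.6, 12.0])

X = np.column_stack([np.ones_like(x), x])          # intercept column plus x
beta_hat = np.linalg.solve(X.T @ X, X.T @ y)       # (b0, b1) minimising the residual sum of squares
residuals = y - X @ beta_hat
print(beta_hat)                                    # roughly (2.1, 1.5), consistent with y = 2 + 1.5x
print((residuals ** 2).sum())                      # the minimised sum of squares
```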

1.12. Estimation

1.12.1. Least squares estimates

  • Minimize sum of the squared residuals
  • Solve simultaneous equations

Provided the data (and residuals) are Gaussian.

[Figure: y plotted against x with scatter around the line y = 2 + 1.5x + ε]

1.13. Gaussian distribution

Probability density function

[Figure: Gaussian densities for µ = 25, σ² = 5; µ = 25, σ² = 2; and µ = 10, σ² = 2]

$f(x \mid \mu, \sigma^2) = \frac{1}{\sqrt{2\pi\sigma^2}}\, e^{-\frac{(x-\mu)^2}{2\sigma^2}}$
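To make the density above concrete, here is a small Python check (illustrative only) that evaluates the formula by hand and compares it against scipy.stats.norm.pdf for the slide's µ = 25, σ² = 5 case.

```python
# Sketch: evaluate the Gaussian density by hand and compare with scipy.stats.norm.pdf.
import numpy as np
from scipy import stats

mu, sigma2 = 25.0, 5.0                              # one of the parameter settings shown above
x = np.linspace(5, 40, 8)

by_hand = np.exp(-(x - mu) ** 2 / (2 * sigma2)) / np.sqrt(2 * np.pi * sigma2)
library = stats.norm.pdf(x, loc=mu, scale=np.sqrt(sigma2))
print(np.allclose(by_hand, library))                # True
```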

1.14. Linear model assumptions

  • Normality
  • Homogeneity of variance
  • Linearity
  • Independence

yi = β0 + β1 × xi + εi

  • Linearity

εi ∼ N(0, σ²)

  • Normality

$\mathbf{V} = \mathrm{cov} = \begin{pmatrix} \sigma^2 & 0 & \cdots & 0 \\ 0 & \sigma^2 & \cdots & 0 \\ \vdots & \vdots & \ddots & \vdots \\ 0 & 0 & \cdots & \sigma^2 \end{pmatrix}$

  • Homogeneity of variance
  • Zero covariance (= independence)

1.15. Linear model assumptions


What do we do if the data do not satisfy the assumptions?


1.16. Scale transformations

[Figure: frequency histograms of leaf length (cm) on the linear scale (10 to 40) and of log10 leaf length on the logarithmic scale (0.0 to 2.0)]
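A small Python sketch of the idea (simulated leaf lengths, not the workshop's data): a right-skewed, zero-bound variable becomes roughly symmetric after a log10 transformation.

```python
# Sketch: log10 scale transformation of a skewed, zero-bound variable.
import numpy as np

rng = np.random.default_rng(2)
leaf_length = rng.lognormal(mean=2.5, sigma=0.5, size=200)   # skewed on the linear scale (cm)
log_leaf = np.log10(leaf_length)                             # roughly symmetric on the log10 scale

print(leaf_length.mean(), np.median(leaf_length))            # mean pulled above the median (skew)
print(log_leaf.mean(), np.median(log_leaf))                  # mean and median nearly coincide
```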

1.17. Linear model

yi = β0 + β1xi + εi

  • model embodies data generation processes
  • pertains to:
    – effects (linear predictor)
    – distribution

1.18. Data types

Type           Example             Distribution        Range
Measurements   length, weight      Gaussian            real, −∞ ≤ x ≤ ∞
                                   logNormal           real, 0 < x ≤ ∞
                                   Gamma               real, 0 < x ≤ ∞
Counts         Abundance           Poisson             discrete, 0 ≤ x ≤ ∞
                                   Negative Binomial   discrete, 0 ≤ x ≤ ∞
Binary         Presence/Absence    Binomial            discrete, x = 0, 1
Proportions    Ratio               Binomial            discrete, 0 ≤ x ≤ n
Percentages    Percent cover       Binomial            real, 0 ≤ x ≤ 1
                                   Beta                real, 0 ≤ x ≤ 1


What about density?

1.19. Gamma

Zero-bound variables with large variance

Probability density function

[Figure: gamma densities for µ = 15 with σ² = 15 (a = 15, s = 1), σ² = 30 (a = 7.5, s = 2), and σ² = 60 (a = 3.75, s = 4)]

$f(x \mid s, a) = \frac{1}{s^a \Gamma(a)}\, x^{a-1} e^{-x/s}$

a = shape, s = scale; µ = as, σ² = as²
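As a quick check of the µ = as and σ² = as² relations, an illustrative Python snippet using SciPy's shape/scale parameterisation:

```python
# Sketch: gamma mean and variance from shape a and scale s (mu = a*s, sigma^2 = a*s^2).
from scipy import stats

a, s = 15, 1                                        # the slide's mu = 15, sigma^2 = 15 case
mean, var = stats.gamma.stats(a, scale=s, moments="mv")
print(mean, var)                                    # 15.0, 15.0
```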

1.20. Poisson distribution

Count data


Probability density function

[Figure: Poisson distributions for λ = 25, λ = 15, and λ = 3]

$f(x \mid \lambda) = \frac{e^{-\lambda}\,\lambda^x}{x!}$

µ = σ² = λ

θ (dispersion) = σ²/µ = 1
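An illustrative check (Python/SciPy, not workshop code) that the Poisson mean and variance are both λ, so the dispersion is 1:

```python
# Sketch: for the Poisson distribution, mean = variance = lambda.
from scipy import stats

lam = 3                                             # one of the lambda values shown above
mean, var = stats.poisson.stats(lam, moments="mv")
print(mean, var)                                    # both 3.0, so dispersion sigma^2/mu = 1
```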

1.21. Negative Binomial

Count data


Probability density function

[Figure: negative binomial distributions for µ = 25, 15, 3 with ω = ∞ (θ = 0), and for µ = 15 with ω = 7.5 (θ = 0.133, σ² = 3µ), ω = 3 (θ = 0.333, σ² = 6µ), and ω = 1.667 (θ = 0.6, σ² = 10µ)]

$f(x \mid \mu, \omega) = \frac{\Gamma(x+\omega)}{\Gamma(\omega)\, x!} \times \frac{\mu^x\, \omega^\omega}{(\mu+\omega)^{x+\omega}}$

θ (dispersion) = 1/ω, where ω = −µ²/(µ − σ²)

θ = 0 when ω = ∞
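A short check of the mean-variance relationship implied above, σ² = µ + µ²/ω, using SciPy's (n, p) parameterisation with n = ω and p = ω/(ω + µ); the translation between parameterisations is my own, and the values µ = 15, ω = 7.5 come from the figure.

```python
# Sketch: negative binomial variance sigma^2 = mu + mu^2/omega (over-dispersed relative to Poisson).
from scipy import stats

mu, omega = 15.0, 7.5
p = omega / (omega + mu)                            # map (mu, omega) onto scipy's (n, p)
mean, var = stats.nbinom.stats(omega, p, moments="mv")
print(mean, var)                                    # 15.0, 45.0  (sigma^2 = 3*mu, as in the figure)
print(mu + mu ** 2 / omega)                         # 45.0, matching the formula
```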

1.22. Binomial distribution

Proportions or Presence/absence

$f(x \mid n, p) = \binom{n}{x}\, p^x (1-p)^{n-x}$

µ = np, σ² = np(1 − p); for presence/absence, n = 1
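Likewise for the binomial (illustrative Python): µ = np and σ² = np(1 − p), with n = 1 giving the presence/absence case.

```python
# Sketch: binomial mean and variance; n = 1 is the Bernoulli (presence/absence) case.
from scipy import stats

n, p = 10, 0.3                                      # arbitrary illustrative values
print(stats.binom.stats(n, p, moments="mv"))        # (3.0, 2.1) = (n*p, n*p*(1 - p))
print(stats.binom.stats(1, p, moments="mv"))        # (0.3, 0.21) for presence/absence
```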

1.23. Beta

Continuous between 0 and 1


Probability density function

[Figure: beta densities for µ = 0.5, σ² = 0.023 (a = 5, b = 5); µ = 0.167, σ² = 0.019 (a = 1, b = 5); µ = 0.833, σ² = 0.019 (a = 5, b = 1); and µ = 0.5, σ² = 0.125 (a = 0.5, b = 0.5)]

$f(x \mid a, b) = \frac{\Gamma(a+b)}{\Gamma(a)\Gamma(b)}\, x^{a-1} (1-x)^{b-1}$

$\mu = \frac{a}{a+b}, \qquad \sigma^2 = \frac{ab}{(a+b)^2 (a+b+1)}$

  • must consider zero/one-inflated variants (the beta distribution excludes exact 0 and 1)
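And for the beta distribution (illustrative Python), recovering the slide's a = b = 5 case from the formulas above:

```python
# Sketch: beta mean and variance from the shape parameters a and b.
from scipy import stats

a, b = 5, 5
mean, var = stats.beta.stats(a, b, moments="mv")
print(mean, var)                                            # 0.5, ~0.023 as in the figure
print(a / (a + b), a * b / ((a + b) ** 2 * (a + b + 1)))    # same values from the formulas
```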

1.24. Generalized linear models

Y = β0 + β1x1 + ... + βpxp + ε

g(µ) = β0 + β1x1 + ... + βpxp

  • Random component: Y ∼ Dist(µ, ...)
  • Systematic component: β0 + β1x1 + ... + βpxp (on the scale [−∞, ∞])
  • Link function g(): g(µ) = β0 + β1x1 + ... + βpxp
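To make the three components concrete, an illustrative Python sketch (statsmodels with simulated data, not the workshop's own example): a Poisson random component, a linear systematic component, and a log link.

```python
# Sketch: a GLM with Poisson random component and log link, fitted to simulated counts.
import numpy as np
import statsmodels.api as sm

rng = np.random.default_rng(3)
x = rng.uniform(0, 2, 100)
y = rng.poisson(np.exp(0.5 + 1.0 * x))              # counts generated on the log scale

X = sm.add_constant(x)                              # systematic component: b0 + b1*x
fit = sm.GLM(y, X, family=sm.families.Poisson()).fit()   # log link is the Poisson default
print(fit.params)                                   # roughly (0.5, 1.0)
```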

1.25. Generalized linear models

Linear model is just a special case:

Y = β0 + β1x1 + ... + βpxp + ε

g(µ) = β0 + β1x1 + ... + βpxp

  • Random component: Y ∼ N(µ, σ²)
  • Systematic component: β0 + β1x1 + ... + βpxp (on the scale [−∞, ∞])
  • Link function g() = identity: I(µ) = β0 + β1x1 + ... + βpxp
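A short illustrative check of the "special case" point (again statsmodels with simulated data): a Gaussian GLM with the identity link returns the same estimates as ordinary least squares.

```python
# Sketch: Gaussian GLM with identity link == ordinary least squares.
import numpy as np
import statsmodels.api as sm

rng = np.random.default_rng(4)
x = np.arange(1, 7, dtype=float)
y = 2 + 1.5 * x + rng.normal(0, 1, x.size)
X = sm.add_constant(x)

ols = sm.OLS(y, X).fit()
glm = sm.GLM(y, X, family=sm.families.Gaussian()).fit()   # identity link by default
print(np.allclose(ols.params, glm.params))                # True
```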


1.26. Generalized linear models

Response variable         Probability distribution   Canonical link function                           Model name
Continuous measurements   Gaussian                   identity: µ                                       Linear regression
                          Gamma                      inverse: 1/µ                                      Gamma regression
Counts                    Poisson                    log: log(µ)                                       Poisson regression / log-linear model
                          Negative binomial          log: log(µ)                                       Negative binomial regression
                          Quasi-Poisson              log: log(µ)                                       Poisson regression
Binary, proportions       Binomial                   logit: log(π / (1 − π))                           Logistic regression
                                                     probit: (1/√(2π)) ∫_{−∞}^{α+βX} exp(−Z²/2) dZ     Probit regression
                                                     complementary log-log: log(−log(1 − π))           Logistic regression
                          Quasi-binomial             logit: log(π / (1 − π))                           Logistic regression
Percentages               Beta                       logit: log(π / (1 − π))                           Beta regression
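As one worked row of the table (illustrative Python with statsmodels and simulated presence/absence data): a binomial response with the canonical logit link, i.e. logistic regression.

```python
# Sketch: logistic regression = binomial GLM with the logit link.
import numpy as np
import statsmodels.api as sm

rng = np.random.default_rng(5)
x = rng.normal(size=200)
p = 1 / (1 + np.exp(-(-0.5 + 1.2 * x)))             # inverse-logit of the linear predictor
y = rng.binomial(1, p)                              # simulated presence/absence

X = sm.add_constant(x)
fit = sm.GLM(y, X, family=sm.families.Binomial()).fit()   # logit is the canonical/default link
print(fit.params)                                   # roughly (-0.5, 1.2)
```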

1.27. OLS

[Figure: sum of squares plotted against candidate parameter estimates (6 to 14), minimised at µ = 10]

1.28. Maximum Likelihood

$f(x \mid \mu, \sigma^2) = \frac{1}{\sqrt{2\pi\sigma^2}}\, e^{-\frac{(x-\mu)^2}{2\sigma^2}}$

$\ln L(\mu, \sigma^2) = -\frac{n}{2}\ln(2\pi) - \frac{n}{2}\ln\sigma^2 - \frac{1}{2\sigma^2}\sum_{i=1}^{n}(x_i - \mu)^2$

Maximum likelihood estimates:

$\hat{\mu} = \bar{x} = \frac{1}{n}\sum_{i=1}^{n} x_i, \qquad \hat{\sigma}^2 = \frac{1}{n}\sum_{i=1}^{n}(x_i - \bar{x})^2$
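An illustrative numerical check (Python/SciPy, simulated data): maximising the log-likelihood above with a general-purpose optimiser recovers the closed-form estimates µ̂ = x̄ and σ̂² = (1/n)Σ(xᵢ − x̄)².

```python
# Sketch: numerical maximum likelihood for a Gaussian sample vs. the closed-form estimates.
import numpy as np
from scipy.optimize import minimize

rng = np.random.default_rng(6)
x = rng.normal(10, 2, 50)
n = x.size

def neg_log_lik(theta):
    mu, sigma2 = theta
    return (0.5 * n * np.log(2 * np.pi) + 0.5 * n * np.log(sigma2)
            + ((x - mu) ** 2).sum() / (2 * sigma2))

opt = minimize(neg_log_lik, x0=[5.0, 1.0], bounds=[(None, None), (1e-6, None)])
print(opt.x)                                        # numerical MLEs of (mu, sigma^2)
print(x.mean(), ((x - x.mean()) ** 2).mean())       # closed-form MLEs (note: divisor n, not n - 1)
```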

1.29. Maximum Likelihood

[Figure: log-likelihood plotted against candidate parameter estimates (6 to 14), maximised at µ = 10]