Workshop 4: Statistical modelling intro, Murray Logan, 10 Mar 2019 (PowerPoint presentation)



SLIDE 1

Workshop 4: Statistical modelling intro

Murray Logan 10 Mar 2019

SLIDE 2

SLIDE 3

Section 1 Introduction

SLIDE 4

Statistical modelling

What is a statistical model?

SLIDE 5

Statistical modelling

What is a statistical model?

Mathematical model

[Plot: y (2-12) versus x (1-6); the line y = 2 + 1.5x]

y = 2 + 1.5x

Statistical model

[Plot: y (2-12) versus x (1-6); the line y = 2 + 1.5x plus error]

y = 2 + 1.5x + ε
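The contrast above can be sketched by generating data from both forms of the slide's model. This is a minimal illustration, not from the slides; the random seed and σ = 1 for the error term are my assumptions.

```python
import random

random.seed(1)

beta0, beta1, sigma = 2.0, 1.5, 1.0  # slide's parameters; sigma is assumed

x = [1, 2, 3, 4, 5, 6]
# Mathematical model: purely deterministic
y_math = [beta0 + beta1 * xi for xi in x]
# Statistical model: adds stochastic error eps ~ N(0, sigma^2)
y_stat = [beta0 + beta1 * xi + random.gauss(0, sigma) for xi in x]
```

Running the deterministic model twice gives identical y values; the statistical model scatters around the same line.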

SLIDE 6

Statistical modelling

[Plot: y versus x; y = 2 + 1.5x + ε]

What is a statistical model?

  • a stochastic mathematical expression
  • a low-dimensional summary of the data
  • relates one or more dependent random variables to one or more independent variables

SLIDE 7

Statistical modelling

A random variable is one whose values depend on a set of random events and are described by a probability distribution

SLIDE 8

Statistical modelling

[Plot: y versus x; y = 2 + 1.5x + ε]

What is a statistical model?

  • embodies a data generation process, along with the distributional assumptions underlying this generation
  • incorporates uncertainty (error): response = model + error
SLIDE 9

Statistical modelling

[Plot: y versus x; y = 2 + 1.5x + ε]

What is the purpose of statistical modelling?

  • describe relationships / effects
  • estimate effects
  • predict outcomes
SLIDE 10

Statistical models

[Plot: y versus x; y = 2 + 1.5x + ε]

How do we estimate model parameters? Y ∼ β0 + β1X

What criterion do we use to assess best fit?

  • Depends on how we assume Y is distributed
SLIDE 11

Statistical models

[Plot: y versus x; y = 2 + 1.5x + ε]

If we assume Y is drawn from a normal (Gaussian) distribution:

  • Ordinary Least Squares (OLS)
SLIDE 12

Estimation

  • parameters
  • location (mean)
  • spread (variance) - uncertainty
SLIDE 13

Estimation

Least squares

[Plot: sum of squares versus parameter estimates (6-14); minimum at µ = 10]
SLIDE 14

Estimation

Least squares estimates

  • Minimize the sum of the squared residuals
  • Solve the simultaneous equations

Y     X
3.0   0
2.5   1
6.0   2
5.5   3
9.0   4
8.6   5
12.0  6

3.0 = β0 × 1 + β1 × 0 + ε1
2.5 = β0 × 1 + β1 × 1 + ε2
6.0 = β0 × 1 + β1 × 2 + ε3
5.5 = β0 × 1 + β1 × 3 + ε4
9.0 = β0 × 1 + β1 × 4 + ε5
8.6 = β0 × 1 + β1 × 5 + ε6
12.0 = β0 × 1 + β1 × 6 + ε7
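The simultaneous equations above have the familiar closed-form OLS solution. A stdlib-only sketch using the slide's data (the first X value is taken as 0, as implied by the equations; variable names are mine):

```python
# Least squares fit of Y = b0 + b1*X to the slide's data
X = [0, 1, 2, 3, 4, 5, 6]
Y = [3.0, 2.5, 6.0, 5.5, 9.0, 8.6, 12.0]

n = len(X)
xbar = sum(X) / n
ybar = sum(Y) / n

# Closed-form solution that minimizes the sum of squared residuals
b1 = sum((xi - xbar) * (yi - ybar) for xi, yi in zip(X, Y)) \
     / sum((xi - xbar) ** 2 for xi in X)
b0 = ybar - b1 * xbar
```

The estimates (b0 ≈ 2.14, b1 ≈ 1.51) land close to the generating values β0 = 2 and β1 = 1.5 used throughout the slides.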

SLIDE 15

Estimation

Least squares estimates

  • Minimize the sum of the squared residuals
  • Solve the simultaneous equations

Provided the data (and residuals) are Gaussian.

[Plot: y versus x; y = 2 + 1.5x + ε]

SLIDE 16

Gaussian distribution

Probability density function

[Plot: Gaussian pdfs for (µ = 25, σ² = 5), (µ = 25, σ² = 2), (µ = 10, σ² = 2); x from 5 to 40]

f(x | µ, σ²) = 1/√(2πσ²) · e^(−(x−µ)²/(2σ²))
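The density formula translates directly to code; a minimal sketch (the function name is mine):

```python
import math

def gaussian_pdf(x, mu, sigma2):
    """f(x | mu, sigma^2) = exp(-(x - mu)^2 / (2*sigma^2)) / sqrt(2*pi*sigma^2)"""
    return math.exp(-(x - mu) ** 2 / (2 * sigma2)) / math.sqrt(2 * math.pi * sigma2)
```

At x = µ the exponential term is 1, so the peak height is 1/√(2πσ²); smaller σ² gives a taller, narrower curve, matching the legend above.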

SLIDE 17

Linear model assumptions

  • Normality
  • Homogeneity of variance
  • Linearity
  • Independence

yi = β0 + β1 × xi + εi   (linearity)
εi ∼ N(0, σ²)   (normality)

V = cov =
⎡ σ²  0   ⋯   0  ⎤
⎢ 0   σ²  ⋯   0  ⎥
⎢ ⋮   ⋮   ⋱   ⋮  ⎥
⎣ 0   0   ⋯   σ² ⎦

  • equal diagonal elements (σ²): homogeneity of variance
  • zero off-diagonal covariances: independence

SLIDE 18

Linear model assumptions

yi = β0 + β1 × xi + εi   (linearity)
εi ∼ N(0, σ²)   (normality)

V = cov = a diagonal matrix with σ² on the diagonal (homogeneity of variance) and zero covariances elsewhere (independence)

What do we do if the data do not satisfy the assumptions?

SLIDE 19

Scale transformations

[Histograms: frequency versus leaf length (cm, 10-40) on a linear scale, and frequency versus log10 leaf length (0.0-2.0) on a logarithmic scale]

SLIDE 20

Linear model

yi = β0 + β1xi + εi

  • the model embodies the data generation process
  • pertains to:
      • the effects (linear predictor)
      • the distribution
SLIDE 21

Data types

Type          Example           Distribution        Range
Measurements  length, weight    Gaussian            real, −∞ < x < ∞
                                logNormal           real, 0 < x < ∞
                                Gamma               real, 0 < x < ∞
Counts        Abundance         Poisson             discrete, 0 ≤ x < ∞
                                Negative Binomial   discrete, 0 ≤ x < ∞
Binary        Presence/Absence  Binomial            discrete, x = 0, 1
Proportions   Ratio             Binomial            discrete, 0 ≤ x ≤ n
Percentages   Percent cover     Binomial            real, 0 ≤ x ≤ 1
                                Beta                real, 0 < x < 1

What about density?

SLIDE 22

Gamma

zero-bound variables with large variance

Probability density function

[Plot: Gamma pdfs for (µ = 15, σ² = 15; a = 15, s = 1), (µ = 15, σ² = 30; a = 7.5, s = 2), (µ = 15, σ² = 60; a = 3.75, s = 4); x from 5 to 40]

f(x | s, a) = 1/(s^a Γ(a)) · x^(a−1) e^(−x/s)

a: shape, s: scale; µ = as, σ² = as²
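The legend's (µ, σ²) pairs map to shape and scale via µ = as and σ² = as². A small sketch of that conversion (the helper name is mine):

```python
def gamma_shape_scale(mu, sigma2):
    """Convert a Gamma mean/variance (mu = a*s, sigma^2 = a*s^2)
    to shape a and scale s."""
    s = sigma2 / mu       # scale: sigma^2 / mu
    a = mu / s            # shape: equivalently mu^2 / sigma^2
    return a, s
```

This reproduces the legend above, e.g. (µ = 15, σ² = 30) gives a = 7.5, s = 2.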

SLIDE 23

Poisson distribution

Count data

Probability density function

[Plot: Poisson distributions for λ = 25, λ = 15, λ = 3; x from 5 to 40]

f(x | λ) = e^(−λ) λ^x / x!

µ = σ² = λ (the dispersion is fixed: variance equals the mean)

SLIDE 24

Negative Binomial

Count data

Probability density function

[Plot: negative binomial distributions for µ = 25, 15, 3 with ω = ∞ (θ = 0), equivalent to the Poisson]

[Plot: negative binomial distributions for µ = 15 with ω = 7.5 (θ = 0.133; σ² = 3µ), ω = 3 (θ = 0.333; σ² = 6µ), ω = 1.667 (θ = 0.6; σ² = 10µ)]

f(x | µ, ω) = Γ(x+ω) / (Γ(ω) x!) · µ^x ω^ω / (µ+ω)^(x+ω)

σ² = µ + µ²/ω: dispersion grows as ω shrinks; ω = ∞ recovers the Poisson
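The mean-variance relationship implied by the legend, σ² = µ + µ²/ω, can be checked directly; a one-function sketch (the name is mine):

```python
def nb_variance(mu, omega):
    """Negative binomial variance: sigma^2 = mu + mu^2 / omega.
    As omega -> infinity this tends to mu, i.e. the Poisson case."""
    return mu + mu ** 2 / omega
```

For µ = 15 this reproduces the legend: ω = 7.5 gives σ² = 45 = 3µ, ω = 3 gives σ² = 90 = 6µ.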

SLIDE 25

Binomial distribution

Proportions or presence/absence

f(x | n, p) = C(n, x) · p^x (1−p)^(n−x), where C(n, x) = n! / (x!(n−x)!)

µ = np, σ² = np(1−p)

For presence/absence, n = 1.
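The mass function translates directly to code; a stdlib sketch using math.comb for the binomial coefficient (the function name is mine):

```python
import math

def binomial_pmf(x, n, p):
    """f(x | n, p) = C(n, x) * p^x * (1 - p)^(n - x)"""
    return math.comb(n, x) * p ** x * (1 - p) ** (n - x)
```

With n = 1 this reduces to the Bernoulli case used for presence/absence, and the mean over all x comes out as np.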

SLIDE 26

Beta

Continuous between 0 and 1

Probability density function

[Plot: Beta pdfs for (µ = 0.5, σ² = 0.023; a = 5, b = 5), (µ = 0.167, σ² = 0.019; a = 1, b = 5), (µ = 0.833, σ² = 0.019; a = 5, b = 1), (µ = 0.5, σ² = 0.125; a = 0.5, b = 0.5); x from 0 to 1]

f(x | a, b) = Γ(a+b) / (Γ(a)Γ(b)) · x^(a−1) (1−x)^(b−1)

µ = a/(a+b), σ² = ab / ((a+b)²(a+b+1))

  • must consider zero-one inflation
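The moment formulas can be sketched as a small helper (the name is mine) and checked against the legend's values:

```python
def beta_moments(a, b):
    """Beta mean and variance: mu = a/(a+b), sigma^2 = a*b / ((a+b)^2 * (a+b+1))."""
    mu = a / (a + b)
    var = a * b / ((a + b) ** 2 * (a + b + 1))
    return mu, var
```

For example (a = 5, b = 5) gives µ = 0.5, σ² ≈ 0.023, as in the legend above.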
SLIDE 27

Generalized linear models

A generalized linear model has three components:

  • Random component: Y ∼ Dist(µ, ...)
  • Systematic component (linear predictor): β0 + β1x1 + ... + βpxp, on the scale [−∞, ∞]
  • Link function g(), connecting the two: g(µ) = β0 + β1x1 + ... + βpxp
SLIDE 28

Generalized linear models

The linear model is just a special case:

  • Random component: Y ∼ N(µ, σ²)
  • Systematic component: β0 + β1x1 + ... + βpxp, on the scale [−∞, ∞]
  • Link function g() = identity: I(µ) = β0 + β1x1 + ... + βpxp

SLIDE 29

Generalized linear models

Response variable          Probability distribution   Canonical link function                           Model name
Continuous measurements    Gaussian                   identity: µ                                       Linear regression
                           Gamma                      inverse: 1/µ                                      Gamma regression
Counts                     Poisson                    log: log(µ)                                       Poisson regression / log-linear model
                           Negative binomial          log: log(µ)                                       Negative binomial regression
                           Quasi-poisson              log: log(µ)                                       Poisson regression
Binary, proportions        Binomial                   logit: log(π/(1−π))                               Logistic regression
                                                      probit: (1/√(2π)) ∫_{−∞}^{α+β·X} exp(−Z²/2) dZ    Probit regression
                                                      complementary log-log: log(−log(1−π))             Complementary log-log regression
                           Quasi-binomial             logit: log(π/(1−π))                               Logistic regression
Percentages                Beta                       logit: log(π/(1−π))                               Beta regression
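The canonical binomial link and its inverse can be sketched as follows (function names are mine):

```python
import math

def logit(mu):
    """Canonical binomial link g(mu) = log(mu / (1 - mu)); maps (0, 1) -> (-inf, inf)."""
    return math.log(mu / (1 - mu))

def inv_logit(eta):
    """Inverse link (logistic function); maps the linear predictor back onto (0, 1)."""
    return 1 / (1 + math.exp(-eta))
```

The inverse link is what keeps fitted probabilities inside (0, 1) no matter what value the linear predictor takes.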

SLIDE 30

OLS

[Plot: sum of squares versus parameter estimates (6-14); minimum at µ = 10]
SLIDE 31

Maximum Likelihood

f(x | µ, σ²) = 1/√(2πσ²) · e^(−(x−µ)²/(2σ²))

ln L(µ, σ²) = −(n/2) ln(2π) − (n/2) ln σ² − (1/(2σ²)) Σ_{i=1}^{n} (xi − µ)²

Maximum likelihood estimates:

µ̂ = x̄ = (1/n) Σ_{i=1}^{n} xi

σ̂² = (1/n) Σ_{i=1}^{n} (xi − x̄)²
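The closed-form MLEs can be verified numerically. A sketch (function names are mine) that computes the estimates and confirms the log-likelihood peaks there:

```python
import math

def gaussian_mle(xs):
    """MLEs for a Gaussian sample: mu_hat = sample mean,
    sigma2_hat = mean squared deviation (divisor n, not n - 1)."""
    n = len(xs)
    mu_hat = sum(xs) / n
    sigma2_hat = sum((x - mu_hat) ** 2 for x in xs) / n
    return mu_hat, sigma2_hat

def gaussian_loglik(mu, sigma2, xs):
    """ln L(mu, sigma^2) = -(n/2) ln(2 pi) - (n/2) ln sigma^2
    - (1/(2 sigma^2)) * sum (x_i - mu)^2"""
    n = len(xs)
    ss = sum((x - mu) ** 2 for x in xs)
    return -n / 2 * math.log(2 * math.pi) - n / 2 * math.log(sigma2) - ss / (2 * sigma2)
```

Moving µ or σ² away from the MLEs in either direction lowers the log-likelihood, which is the picture on the next slide.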

SLIDE 32

Maximum Likelihood

[Plot: log-likelihood versus parameter estimates (6-14); maximum at µ = 10]