Some Statistical Problems in Climate Reconstruction, by Dan Cervone



slide-1
SLIDE 1

Some Statistical Problems in Climate Reconstruction

Dan Cervone

STAT 300: Research in Statistics

April 15, 2014


slide-6
SLIDE 6

Historical Global Temperature Reconstruction

Data: CRUTEMv3

[Figure: Northern hemisphere temperature anomalies; Temp (°C), from −1.0 to 1.0, vs. year, 1850 to 2000.]

[Figure: Northern hemisphere temperature sites; number of sites, from 50 to 250, vs. year, 1850 to 2000.]

What is the estimand?

  • Interpolate gaps in observational record

  • Extrapolate before 1850

[Figure: time axis extended to cover 1400 through 2000; vertical axis from −1.0 to 1.0.]

slide-7
SLIDE 7

All the moments each moment

Image: NASA/GISS


slide-8
SLIDE 8

Proxies:

  • 18O/16O, ocean sediment

  • Ice cores

  • Varves (rock sediment)

  • Tree rings

[image source: wikimedia commons]


slide-11
SLIDE 11

Example: BARCAST

Tingley & Huybers 2010, 2013

Spatiotemporal temperature reconstruction using temperature record and proxies:

$T_t = \begin{pmatrix} T_{o,t} \\ T_{p,t} \end{pmatrix}$ at locations $S = \begin{pmatrix} S_o \\ S_p \end{pmatrix}$

$T_o$ are temperatures at locations of temperature records $S_o$.

$T_p$ are temperatures at locations of proxy records $S_p$.

With $t$ indexing years,

$T_t - \mu\mathbf{1} = \alpha (T_{t-1} - \mu\mathbf{1}) + \epsilon_t$

$\epsilon_t \overset{iid}{\sim} N(0, K(S, S))$

$K(s, s^*) = \tau^2 \exp(-\gamma \|s - s^*\|^2)$
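The process-level model above can be simulated directly. The following is a minimal sketch, not the BARCAST implementation: it assumes one-dimensional locations, reads the trailing 2 in the covariance function as a square, and starts the chain from the stationary distribution. The function names and all parameter values are hypothetical.

```python
import numpy as np

def sq_exp_cov(s, tau2=1.0, gamma=0.5):
    """K(s, s*) = tau2 * exp(-gamma * (s - s*)^2) for 1-D locations (assumed form)."""
    d = s[:, None] - s[None, :]
    return tau2 * np.exp(-gamma * d**2)

def simulate_field(s, n_years, mu=0.0, alpha=0.6, tau2=1.0, gamma=0.5, seed=0):
    """Draw T_1..T_n from T_t - mu*1 = alpha * (T_{t-1} - mu*1) + eps_t."""
    rng = np.random.default_rng(seed)
    n = len(s)
    # Small jitter keeps the Cholesky factorization numerically stable.
    K = sq_exp_cov(s, tau2, gamma) + 1e-8 * np.eye(n)
    L = np.linalg.cholesky(K)
    T = np.empty((n_years, n))
    # Assumption: start from the stationary law N(mu*1, K / (1 - alpha^2)).
    T[0] = mu + (L @ rng.standard_normal(n)) / np.sqrt(1.0 - alpha**2)
    for t in range(1, n_years):
        eps = L @ rng.standard_normal(n)      # eps_t ~ N(0, K(S, S))
        T[t] = mu + alpha * (T[t - 1] - mu) + eps
    return T

s = np.linspace(0.0, 10.0, 6)                 # hypothetical site locations
T = simulate_field(s, n_years=200)
```

Each row of `T` is one year of the latent field; neighbouring columns are correlated through `K` and successive rows through `alpha`.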


slide-14
SLIDE 14

Example: BARCAST

Tingley & Huybers 2010, 2013

“Errors in variables”: True temperatures $T$ are not observed.

Measurement error for temperature sites: $W_{o,t} \sim N(T_{o,t}, \sigma_o^2 I)$.

Linear model for proxies: $W_{p,t} \sim N(\mu_p \mathbf{1} + T_{p,t}\beta_p, \sigma_p^2 I)$.

$(W \; T)'$ is just a huge multivariate normal!

Inference with Gibbs sampling or EM:

  • Update latent $T$.

  • Update parameters $\tau^2, \gamma, \mu, \alpha, \mu_p, \beta_p, \sigma_o^2, \sigma_p^2$.
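Because everything is jointly normal, the "update latent T" step reduces to Gaussian conditioning. The sketch below handles a single year only and is not the authors' sampler: `prior_mean` and `prior_cov` are hypothetical stand-ins for whatever conditional prior the neighbouring years imply, and every numeric value is made up for illustration.

```python
import numpy as np

def update_latent_T(W_o, W_p, prior_mean, prior_cov,
                    mu_p, beta_p, sig2_o, sig2_p):
    """Posterior of T = (T_o, T_p) given W_o ~ N(T_o, sig2_o I) and
    W_p ~ N(mu_p 1 + beta_p T_p, sig2_p I), with prior T ~ N(prior_mean, prior_cov)."""
    n_o, n_p = len(W_o), len(W_p)
    h = np.concatenate([np.ones(n_o), np.full(n_p, beta_p)])          # slope per record
    a = np.concatenate([np.zeros(n_o), np.full(n_p, mu_p)])           # intercept per record
    r = np.concatenate([np.full(n_o, sig2_o), np.full(n_p, sig2_p)])  # noise variances
    W = np.concatenate([W_o, W_p])
    prior_prec = np.linalg.inv(prior_cov)
    # Posterior precision = prior precision + H' R^{-1} H (diagonal here).
    post_cov = np.linalg.inv(prior_prec + np.diag(h**2 / r))
    post_mean = post_cov @ (prior_prec @ prior_mean + h * (W - a) / r)
    return post_mean, post_cov

S = np.array([0.0, 1.0, 2.0, 3.0])            # two instrumental sites, then two proxy sites
prior_cov = np.exp(-0.5 * (S[:, None] - S[None, :])**2)
prior_mean = np.zeros(4)
post_mean, post_cov = update_latent_T(
    W_o=np.array([0.3, -0.1]), W_p=np.array([1.2, 0.8]),
    prior_mean=prior_mean, prior_cov=prior_cov,
    mu_p=1.0, beta_p=0.5, sig2_o=0.01, sig2_p=0.1)
```

A Gibbs sampler would draw `T` from `N(post_mean, post_cov)`; an EM algorithm would use the same two moments in its E-step.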


slide-18
SLIDE 18

Example: BARCAST

Tingley & Huybers 2010, 2013

Difficulties:

  • Spatiotemporal nonstationarity and anisotropy.

  • Model inhomogeneity.

  • Uncertainty in spatial referencing.

  • ...

slide-19
SLIDE 19

Location uncertainty

[Figure: map of record locations; legend: temperature site, ice core, tree ring, varve.]

  • Tree locations uncertain for many older specimens

  • Ice cores subject to glacial flow


slide-23
SLIDE 23

Gaussian Processes

For $s \in S$, $X(s) \sim GP(0, K(s, s))$ means for any $s_1, \ldots, s_p \in S$,

$$\begin{pmatrix} X(s_1) \\ \vdots \\ X(s_p) \end{pmatrix} \sim N\left(0, \begin{pmatrix} K(s_1, s_1) & \cdots & K(s_1, s_p) \\ \vdots & \ddots & \vdots \\ K(s_p, s_1) & \cdots & K(s_p, s_p) \end{pmatrix}\right),$$

where $K(\cdot, \cdot)$ is a covariance function, e.g. $K(s, s^*) = \tau^2 \exp(-\gamma \|s - s^*\|^2)$.

Interpolation of $X$ at unobserved location $s^*$:

$$X(s^*) \mid X(s) \sim N\big(K(s^*, s)K(s, s)^{-1}X(s),\; K(s^*, s^*) - K(s^*, s)K(s, s)^{-1}K(s, s^*)\big)$$

Kriging: BLUP without normality assumption
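The conditional mean and covariance above translate into a few lines of linear algebra. This is a minimal sketch assuming one-dimensional locations, known parameters, and the squared-exponential reading of the covariance function; the data `x` and the parameter values are invented for illustration.

```python
import numpy as np

def sq_exp_cov(a, b, tau2=1.0, gamma=0.5):
    """Cross-covariance K(a, b) = tau2 * exp(-gamma * (a_i - b_j)^2)."""
    d = a[:, None] - b[None, :]
    return tau2 * np.exp(-gamma * d**2)

def gp_interpolate(s, x, s_star, tau2=1.0, gamma=0.5, jitter=1e-8):
    """Mean and covariance of X(s*) | X(s) = x for a mean-zero GP."""
    K_ss = sq_exp_cov(s, s, tau2, gamma) + jitter * np.eye(len(s))
    K_star_s = sq_exp_cov(s_star, s, tau2, gamma)
    K_star_star = sq_exp_cov(s_star, s_star, tau2, gamma)
    # B = K(s*, s) K(s, s)^{-1}, computed with a solve rather than an inverse.
    B = np.linalg.solve(K_ss, K_star_s.T).T
    mean = B @ x
    cov = K_star_star - B @ K_star_s.T
    return mean, cov

s = np.array([0., 1., 2., 3., 4., 6., 7., 8., 9., 10.])
x = np.sin(s)                                  # hypothetical observations
mean, cov = gp_interpolate(s, x, np.array([5.0, 11.0]))
```

With this design, the predictive variance at the interior point 5 is smaller than at the exterior point 11, which has an observed neighbour on one side only.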


slide-28
SLIDE 28

GP interpolation with location error

Errors in variables: $X(s^*) \mid X(s) \sim N(b'X(s), v^2)$

Observe $\tilde{X}(s) = X(s) + \epsilon$ where $\epsilon \perp X(s)$.

Still a regression problem: $X(s^*) \mid \tilde{X}(s) \sim N(\tilde{b}'\tilde{X}(s), \tilde{v}^2)$

Berkson errors: $\epsilon \perp \tilde{X}(s)$ (not satisfied)

Is i.i.d. error in $s$ just i.i.d. error in $X(s)$?

$X(s + u) = X(s) + \epsilon$

$\epsilon \mid X(s), u \sim N\big((K_{u,s}K_{s,s}^{-1} - I)X(s),\; K_{u,u} - K_{u,s}K_{s,s}^{-1}K_{s,u}\big)$, where $K_{u,s} = K(s + u, s)$, etc.

No: the mean of $\epsilon$ depends on $X(s)$, so $\epsilon \not\perp X(s)$ and $\epsilon \not\perp X(s + u)$.
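The dependence between the induced error and the process can be checked numerically. The sketch below fixes a single location and a single known shift `u` (both simplifying assumptions), uses the squared-exponential covariance with made-up parameter values, and compares empirical covariances with their analytic values $K(s+u, s) - K(s, s)$ and $K(s+u, s+u) - K(s+u, s)$.

```python
import numpy as np

tau2, gamma, u = 1.0, 0.5, 0.8                 # hypothetical values
k = tau2 * np.exp(-gamma * u**2)               # Cov(X(s), X(s + u))
joint_cov = np.array([[tau2, k],
                      [k, tau2]])

rng = np.random.default_rng(1)
draws = rng.multivariate_normal(np.zeros(2), joint_cov, size=200_000)
x_s, x_su = draws[:, 0], draws[:, 1]
eps = x_su - x_s                               # error induced by shifting the location

cov_eps_xs = np.cov(eps, x_s)[0, 1]            # analytic value: k - tau2 (negative)
cov_eps_xsu = np.cov(eps, x_su)[0, 1]          # analytic value: tau2 - k (positive)
```

Both covariances are nonzero whenever `u` is nonzero, so a location error is neither a classical nor a Berkson measurement error in `X`.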


slide-31
SLIDE 31

GP interpolation with location error

Illustration

[Figure: i.i.d. additive error for measurements X; value vs. location, locations 2 to 10.]

Usual GP/kriging regime


slide-33
SLIDE 33

GP interpolation with location error

Illustration

[Figure: i.i.d. additive error for locations s; value vs. location, locations 2 to 10.]


slide-36
SLIDE 36

Kriging

Cressie & Kornak 2003; Fanshawe & Diggle 2010

Location errors induce (non-Gaussian) process by convolution:

$X_g(s) = X(s + u), \quad u \sim g(u)$

$X_g(s)$ is mean 0 with covariance function:

$K_g(s, s) = \int K(s + u, s + u)\, g(u)\, du$, or

$K_g(s, s^*) = \int K(s + u, s^*)\, g(u)\, du$

Kriging gives BLUP for $X(s^*)$ given $X_g(s)$.

Typically $K_g(s, s)$ evaluated by Monte Carlo.
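A Monte Carlo evaluation of $K_g$ can be sketched as follows. This is an illustration under stated assumptions rather than the method of the cited papers: locations are one-dimensional, the location errors are i.i.d. uniform on $(-\theta_u, \theta_u)$, the covariance is squared exponential, and the function names and parameter values are hypothetical.

```python
import numpy as np

def sq_exp_cov(a, b, tau2=1.0, gamma=0.5):
    d = a[:, None] - b[None, :]
    return tau2 * np.exp(-gamma * d**2)

def mc_location_error_cov(s, s_star, theta_u, n_draws=2000,
                          tau2=1.0, gamma=0.5, seed=0):
    """Monte Carlo estimates of K_g(s, s) and K_g(s, s*) by averaging over u ~ g."""
    rng = np.random.default_rng(seed)
    n = len(s)
    K_g_ss = np.zeros((n, n))
    K_g_s_star = np.zeros((n, len(s_star)))
    for _ in range(n_draws):
        u = rng.uniform(-theta_u, theta_u, size=n)   # one draw of the location errors
        K_g_ss += sq_exp_cov(s + u, s + u, tau2, gamma)
        K_g_s_star += sq_exp_cov(s + u, s_star, tau2, gamma)
    return K_g_ss / n_draws, K_g_s_star / n_draws

def blup_weights(K_g_ss, K_g_s_star):
    """Kriging weights: the BLUP of X(s*) is weights.T @ X_g(s)."""
    return np.linalg.solve(K_g_ss, K_g_s_star)

s = np.array([0., 1., 2., 3., 4., 6., 7., 8., 9., 10.])
s_star = np.array([5.0, 11.0])
K_g_ss, K_g_s_star = mc_location_error_cov(s, s_star, theta_u=0.5)
weights = blup_weights(K_g_ss, K_g_s_star)
```

The diagonal of `K_g_ss` stays equal to `tau2` because shifting a site does not change its own variance; only the off-diagonal and cross-covariance entries are smoothed by the averaging.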


slide-38
SLIDE 38

Kriging

With measurement error in locations, the BLUP:

  • Inadmissible under squared error loss

  • First two moments give invalid interval coverage

  • Requires Monte Carlo, generally $O(n^3)$

We should use the BnLUP $E[X(s^*) \mid X_g(s)]$!

  • Dominates BLUP

  • First two moments give valid interval coverage

  • Easily implemented with HMC, generally $O(n^3)$

  • Easily extends to inference for parameters
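To make the conditional expectation concrete, the sketch below approximates $E[X(s^*) \mid X_g(s)]$ by self-normalized importance sampling over the location errors. This swaps in a simpler technique than the HMC named on the slide. It assumes uniform location errors, known parameters, a squared-exponential covariance, and a small nugget added for numerical stability; all names and values are hypothetical.

```python
import numpy as np

def sq_exp_cov(a, b, tau2=1.0, gamma=0.5):
    d = a[:, None] - b[None, :]
    return tau2 * np.exp(-gamma * d**2)

def bnlup_importance(s, x, s_star, theta_u, n_draws=2000,
                     tau2=1.0, gamma=0.5, nugget=0.05, seed=0):
    """E[X(s*) | x] = sum_j w_j E[X(s*) | x, u_j], with u_j ~ g and
    w_j proportional to the Gaussian likelihood N(x; 0, K(s + u_j, s + u_j) + nugget I)."""
    rng = np.random.default_rng(seed)
    n = len(s)
    log_w = np.empty(n_draws)
    cond_means = np.empty((n_draws, len(s_star)))
    for j in range(n_draws):
        u = rng.uniform(-theta_u, theta_u, size=n)
        K = sq_exp_cov(s + u, s + u, tau2, gamma) + nugget * np.eye(n)
        L = np.linalg.cholesky(K)
        z = np.linalg.solve(L, x)
        # log N(x; 0, K) up to a constant shared by every draw.
        log_w[j] = -0.5 * z @ z - np.log(np.diag(L)).sum()
        K_star = sq_exp_cov(s_star, s + u, tau2, gamma)
        cond_means[j] = K_star @ np.linalg.solve(K, x)   # kriging mean given this u
    w = np.exp(log_w - log_w.max())                      # stabilize before normalizing
    w /= w.sum()
    return w @ cond_means, w

rng = np.random.default_rng(42)
s = np.array([0., 1., 2., 3., 4., 6., 7., 8., 9., 10.])
s_star = np.array([5.0, 11.0])
u_true = rng.uniform(-0.5, 0.5, size=len(s))             # simulated "true" location errors
K_true = sq_exp_cov(s + u_true, s + u_true) + 0.05 * np.eye(len(s))
x = np.linalg.cholesky(K_true) @ rng.standard_normal(len(s))
pred, w = bnlup_importance(s, x, s_star, theta_u=0.5)
```

Unlike the BLUP, the resulting predictor is nonlinear in `x`, because the weights `w` themselves depend on the observed values. Importance sampling from the prior on `u` degrades as the number of sites grows, which is one reason a sampler such as HMC would be preferred in practice.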


slide-40
SLIDE 40

Simulation

Compare interpolation using BLUP, BnLUP, and no adjustment for location measurement error.

  • $s = \{0, 1, \ldots, 4, 6, \ldots, 10\}$ and $s^* = \{5, 11\}$

  • $u \sim \mathrm{Unif}(-\theta_u, \theta_u)$

  • Various combinations of the other parameters.

Two cases of particular interest:

  • Strong signal: high autocorrelation ($\gamma$ small) and small “nugget” variance $\sigma^2$

  • Weak signal: low autocorrelation ($\gamma$ large) or large “nugget” variance $\sigma^2$

All parameters fixed and known in simulations.
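One piece of this comparison can be sketched without simulating data: for the unadjusted predictor, the mean squared prediction error given the location errors is available in closed form, so only the errors `u` need to be drawn. The code below is an illustration on the design above, not a reproduction of the talk's simulation. The oracle is taken to be the predictor that knows `u`, and the nugget, `gamma`, and `tau2` values are hypothetical.

```python
import numpy as np

def sq_exp_cov(a, b, tau2=1.0, gamma=0.5):
    d = a[:, None] - b[None, :]
    return tau2 * np.exp(-gamma * d**2)

def relative_mspe_naive(s, s_star, theta_u, sigma2=0.1, n_draws=2000,
                        tau2=1.0, gamma=0.5, seed=0):
    """MSPE of kriging that ignores location error, relative to an oracle
    that knows the true locations s + u (both averaged over u)."""
    rng = np.random.default_rng(seed)
    n = len(s)
    target = np.array([s_star])
    eye = np.eye(n)
    k0 = sq_exp_cov(s, target, tau2, gamma)[:, 0]
    # Naive weights are computed from the nominal (error-free) locations.
    b = np.linalg.solve(sq_exp_cov(s, s, tau2, gamma) + sigma2 * eye, k0)
    naive = 0.0
    oracle = 0.0
    for _ in range(n_draws):
        u = rng.uniform(-theta_u, theta_u, size=n)
        K_u = sq_exp_cov(s + u, s + u, tau2, gamma) + sigma2 * eye
        k_u = sq_exp_cov(s + u, target, tau2, gamma)[:, 0]
        naive += tau2 - 2.0 * b @ k_u + b @ K_u @ b      # E[(X(s*) - b'x)^2 | u]
        oracle += tau2 - k_u @ np.linalg.solve(K_u, k_u) # conditional variance given u
    return naive / oracle

s = np.array([0., 1., 2., 3., 4., 6., 7., 8., 9., 10.])
thetas = [0.1, 0.5, 1.0, 2.0]
interp = [relative_mspe_naive(s, 5.0, th) for th in thetas]    # interpolation at s* = 5
extrap = [relative_mspe_naive(s, 11.0, th) for th in thetas]   # extrapolation at s* = 11
```

The ratio cannot fall below 1, since for each `u` the oracle weights minimize the same quadratic that the naive weights are plugged into.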


slide-42
SLIDE 42

Simulation results

Mean squared prediction error vs. oracle predictor

[Figure: four panels (interpolation for strong signal, extrapolation for strong signal, interpolation for weak signal, extrapolation for weak signal); relative MSPE, from 0.0 to 2.0, vs. θ_u at 0.1, 0.5, 1, 2.]


slide-44
SLIDE 44

Simulation results

95% Interval coverage

[Figure: four panels (interpolation for strong signal, extrapolation for strong signal, interpolation for weak signal, extrapolation for weak signal); 95% interval coverage, from 0.2 to 1.0, vs. θ_u at 0.1, 0.5, 1, 2.]


slide-46
SLIDE 46

Next steps

Application to climate data:

  • Location errors in covariates referencing

  • Large-scale model

  • Influence of extreme value estimation

Also:

  • (Stochastic) EM implementation

  • BnLUP without normality assumption?