spatio temporal smoothing of co 2 retrievals
play

Spatio-Temporal Smoothing of CO 2 Retrievals Noel Cressie Program - PowerPoint PPT Presentation

Spatio-Temporal Smoothing of CO 2 Retrievals Noel Cressie Program in Spatial Statistics and Environmental Statistics Department of Statistics, The Ohio State University Matthias Katzfu ur Angewandte Mathematik, Universit Institut f at


  1. Spatio-Temporal Smoothing of CO 2 Retrievals Noel Cressie Program in Spatial Statistics and Environmental Statistics Department of Statistics, The Ohio State University Matthias Katzfuß ur Angewandte Mathematik, Universit¨ Institut f¨ at Heidelberg Acknowledgment: AIST grant from NASA’s ESTO (PI: Amy Braverman) – p.1/36

  2. AIRS CO 2 Data Global satellite measurements of CO 2 from the AIRS instrument (data below 60 ◦ S are not available) Challenges of global remote-sensing data: Massiveness Need dimension reduction Sparseness Need to take advantage of spatial and temporal correlations Nonstationarity Need a flexible model Goal : Using AIRS CO 2 data, map CO 2 globally after “de-noising” and “gap-filling”; also map uncertainty – p.2/36

  3. Hierarchical Spatio-Temporal Model Time is discrete Write the hidden spatio-temporal process as Y t ( s ) at time t and location s The true process is observed imperfectly and incompletely Data Model : Z t ( s ) = Y t ( s ) + ε t ( s ) ε t ( · ) ∼ Gau (0 , σ 2 ε,t v ε ( · )) , is measurement error that is assumed uncorrelated across time and space σ 2 ε,t and v ε ( · ) are known Goal: For any s 0 on the globe, estimate Y t ( s 0 ); t ∈ { 1 , . . . , T } , after observing { Z t ( s i,t ) } ; define Z 1: T ≡ ( Z ′ 1 , . . . , Z ′ T ) ′ , where Z t ≡ ( Z ( s 1 ,t ) , . . . , Z ( s n t ,t )) ′ Consider a time series of spatial processes (e.g., Cressie and Wikle, 2011). Using a state space model, we smooth the data (cf. filtering ) – p.3/36

  4. Hierarchical Spatio-Temporal Model, ctd. Process Model : Do not use a transport model. Take a secular approach and use a dynamical statistical model; let the data speak. Model the spatio-temporal variability statistically: Y t ( s ) = x t ( s ) ′ β t + S t ( s ) ′ η t + ξ t ( s ) The dimension of η t is r < < n t (dimension reduction; e.g., n t = 12515 , r = 380 ). – p.4/36

  5. Hierarchical Spatio-Temporal Model, ctd. In vector notation , Z t = Y t + ε t Y t = X t β t + S t η t + ξ t Σ t ≡ var ( Z t ) = S t K t S ′ t + D t , where K t ≡ var ( η t ) D t ≡ σ 2 ξ,t V ξ,t + σ 2 ε,t V ε,t , is a diagonal n t × n t matrix Temporal dynamics on r -dimensional space η t = H η t − 1 + δ t U = var ( δ t ) HK t +1 H ′ + U K t = The part of the model for Y t given by S t η t + ξ t , is called a Spatio-Temporal Random Effects (STRE) model Goal: Estimate Y t from Z 1: T – p.5/36

  6. Fixed Rank Smoothing (FRS) For the moment, assume parameters θ are known An extension of the Kalman smoother gives, for t ∈ { 1 , . . . , T } , η t | T ≡ E ( η t | Z 1: T ) , P t | T ≡ var ( η t | Z 1: T ) , P t,t − 1 | T ≡ cov ( η t , η t − 1 | Z 1: T ) ξ t | T ( s 0 ) ≡ E ( ξ t ( s 0 ) | Z 1: T ) , c t | T ( s 0 ) ≡ var ( ξ t ( s 0 ) | Z 1: T ) , R t | T ( s 0 ) ≡ cov ( η t , ξ t ( s 0 ) | Z 1: T ) Then, FRS yields (Cressie, Shi, & Kang, 2010) the estimate of Y t ( s 0 ) . For t = 1 , . . . , T : Y t ( s 0 ) = x t ( s 0 ) ′ β t + S ( s 0 ) ′ η t | T + ξ t | T ( s 0 ) , ˆ and the mean squared prediction error (MSPE), E ( ˆ Y t ( s 0 ) − Y t ( s 0 )) 2 , can also be calculated. Our measure of uncertainty is ( MSPE ) 1 / 2 Rapid computation : Need to invert the very large n t × n t covariance matrix Σ t ; only inversion of r × r matrices and n t × n t diagonal matrices are required. Use the Sherman-Morrison-Woodbury identity , over and over: ( I + PKP ′ ) − 1 = I − P ( K − 1 + P ′ P ) − 1 P ′ – p.6/36

  7. Parameters The parameters, θ , consist of { β t } , { σ 2 ξ,t } , and the elements of K 0 , H , and U In FRS, θ was assumed known. In this presentation, we take an empirical-Bayes approach and “plug in” estimates of the components of θ . Ideally, estimate θ by maximum likelihood estimation (MLE) ; see Katzfuss and Cressie (2011a) A fully Bayes approach has also been developed (Katzfuss and Cressie, 2011b) – p.7/36

  8. Empirical Bayes: Estimate the Parameters Define innovations , α t ≡ Z t − X t β t − S t E ( η t | Z 1: t − 1 ) ; t = 1 , . . . , T ind. ∼ Gau ( 0 , Σ α t ) , α t where Σ α t ≡ S t var ( η t | Z 1: t − 1 ) S ′ t + D t ; t = 1 , . . . , T Likelihood (Shumway & Stoffer, 2006): − 2 log L ( θ ) ≡ − 2 f ( α 1 , . . . , α T | θ ) T T X X α t ( θ ) ′ Σ α t ( θ ) − 1 α t ( θ ) = log | Σ α t ( θ ) | + t =1 t =1 Finding MLEs analytically is intractable , even for T=1 (Katzfuss and Cressie, 2009) – p.8/36

  9. EM Algorithm: Review Suppose we observe X obs ∼ f ( x obs | θ ) , and we are interested in finding the MLE of θ If maximizing L ( θ ) ≡ f ( x obs | θ ) is hard, but there are missing data x mis such that maximizing the complete likelihood , L c ( θ ) ≡ f ( x obs , x mis | θ ) , is easy, we can use the EM algorithm to find the MLE of θ The EM Algorithm (Dempster, Laird, and Rubin, 1977): Starting with θ [0] = θ 0 , repeat the following two steps until convergence: E-Step: Calculate Q ( θ ; θ [ l ] ) ≡ E θ [ l ] { log f ( x obs , X mis | θ ) | x obs } M-Step: Calculate θ [ l +1] = arg max Q ( θ ; θ [ l ] ) θ – p.9/36

  10. The EM Algorithm in this Context (Katzfuss & Cressie, 2011a) Reminder: Z t = X t β t + S t η t + ξ t + ε t Missing (mis) “data” : { η t : t = 0 , 1 , . . . , T } and { ξ t : t = 1 , . . . , T } − 2 log L c ( θ ) = − 2 log f ( η 1: T , Z 1: T , ξ 1: T | θ ) = log | K 0 | + tr ( K − 1 η 0 η ′ 0 ) 0 T T 1 X n t log σ 2 X tr ( V − 1 ξ,t ξ t ξ ′ + ξ,t + t ) σ 2 ξ,t t =1 t =1 T T X X tr { U − 1 ( η t − H η t − 1 )( η t − H η t − 1 ) ′ } + log | U | + t =1 t =1 T 1 tr { V − 1 X ε,t ( Z t − X t β t − S t η t − ξ t )( Z t − X t β t − S t η t − ξ t ) ′ } + σ 2 ε,t t =1 E-Step : FRS provides conditional expectations and covariances of η t and ξ t at θ [ l ] , given Z 1: T M-Step : Taking partial derivatives of Q ( θ ; θ [ l ] ) is easy due to terms being additive – p.10/36

  11. Our EM Algorithm Choose initial value θ [0] in the parameter space Θ While || θ [ l +1] − θ [ l ] || > δ : 1. Run FRS with θ [ l ] to obtain η [ l ] t | T , P [ l ] t | T , P [ l ] t,t − 1 | T , ξ [ l ] t | T , and C [ l ] t | T ≡ diag { c t | T ( s 1 ) , . . . , c t | T ( s n ) } 2. Calculate θ [ l +1] as β [ l +1] ε,t [ Z t − S t η [ l ] t | T − ξ [ l ] t V − 1 t V − 1 = ( X ′ ε,t X t ) − 1 X ′ t | T ] t [ l +1] = tr ( V − 1 ξ,t [ C [ l ] t | T + ξ [ l ] t | T ξ [ l ] σ 2 ′ ]) /n t ξ,t t | T K [ l +1] = P [ l ] t | T + η [ l ] t | T η [ l ] ′ t t | T L [ l +1] = P [ l ] t,t − 1 | T + η [ l ] t | T η [ l ] ′ t t − 1 | T H [ l +1] = ( P T t =1 L [ l +1] t =0 K [ l +1] )( P T − 1 ) − 1 t t U [ l +1] = ( P T t =1 K [ l +1] t =1 L [ l +1] − H [ l +1] P T ′ ) / ( T − 1) t t 3. Repeat – p.11/36

  12. Properties of the EM Estimators, ˆ θ EM The first θ [ l ] that meets the convergence criterion is called ˆ θ EM If θ [0] ∈ Θ , then θ [ l ] ∈ Θ , for all l = 1 , 2 , . . . Because Q ( θ ; θ [ l ] ) is continuous in both arguments, ˆ θ EM is generally a solution to the likelihood equations (Wu, 1983) For T=1, this algorithm is equivalent to the SRE (spatial-only) EM algorithm (Katzfuss and Cressie, 2009) used in Fixed Rank Kriging (FRK) The computational cost of the algorithm is linear in each n t – p.12/36

  13. Summary: EM-FRS Run EM on the observed data to obtain EM parameter estimates, ˆ θ EM Do FRS with EM estimates plugged in to obtain ˆ Y t ( s 0 ) and its corresponding root mean squared prediction error at each location s 0 ; for example, EM + S ( s 0 ) ′ η EM Y t ( s 0 ) EM ≡ x t ( s 0 ) ′ ˆ ˆ t | T + ξ t | T ( s 0 ) EM β – p.13/36

  14. AIRS CO 2 Data Measurements are taken by the Atmospheric Infrared Sounder ( AIRS ) instrument, on NASA’s Aqua satellite Mid-tropospheric CO 2 at roughly 1.30pm local time on May 1-16, 2003. Data are considered daily ( t = 1 , . . . , 16 ) Global coverage: Between − 60 ◦ and 90 ◦ latitude. (Retrievals are not available below 60 ◦ S) n 1 = 12515 , n 2 = 12959 , n 3 = 12971 , n 4 = 12495 , n 5 = 11862 , n 6 = 12317 , n 7 = 12577 , n 8 = 12527 , n 9 = 12651 , n 10 = 12252 , n 11 = 12681 , n 12 = 12076 , n 13 = 12245 , n 14 = 12447 , n 15 = 12372 , n 16 = 12532 We map our estimates onto a (hexagonal) grid of size m t = 57 , 065 for each day t The measurement unit is parts per million (ppm) – p.14/36

  15. AIRS CO 2 Data, ctd Mid-tropospheric CO 2 on May 1-4 (for example), 2003, as measured by AIRS (gridded) – p.15/36

  16. Assumptions and Specifications Level 2 data were moved onto a fine regular hexagonal grid of approximately the resolution of the AIRS footprint. On some occasions, more than one Level 2 datum was in a grid cell: The datum Z t ( s i ) was obtained by averaging all Level 2 data in grid cell i on day t . Variances of measurement error and fine-scale variation: σ 2 ε ≡ σ 2 ε, 1 = . . . = σ 2 ε, 16 = 5 . 6 (estimated from the variogram) σ 2 ξ ≡ σ 2 ξ, 1 = . . . = σ 2 ξ, 16 v ε,t ( s i ) is equal to the inverse of the number of Level 2 data that were averaged over the i -th (hexagonal) grid cell on day t Basis functions: r = 380 bisquare functions at three spatial resolutions, the same for all t = 1 , . . . , 16 . The core basis function is the bisquare function: b ( u ) = (1 − � u − s � 2 ) 2 I ( � u − s � ≤ 1) Trend: x ( s ) = (1 , lat ( s )) ′ Make maps on a hexagonal grid of size m t = 57 , 065 for each day t = 1 , . . . , 16 – p.16/36

Download Presentation
Download Policy: The content available on the website is offered to you 'AS IS' for your personal information and use only. It cannot be commercialized, licensed, or distributed on other websites without prior consent from the author. To download a presentation, simply click this link. If you encounter any difficulties during the download process, it's possible that the publisher has removed the file from their server.

Recommend


More recommend