1. Stochastic solution of large least squares systems in variational data assimilation
Parallel 4DVAR without tangents and adjoints
J. Mandel, S. Gratton, and E. Bergou
University of Colorado Denver; Institute of Computer Science, Czech Academy of Sciences; INP-ENSEEIHT and CERFACS, Toulouse, France
Supported by Fondation STAE project ADTAO, NSF grants AGS-0835579 and DMS-1216481, and Czech Grant Agency project 13-34856S
Preconditioned Iterative Methods, Prague 2013

2. Data assimilation
◮ Standard computing: data in, results out
◮ Not much has changed in principle since we were taking punched cards and magnetic tapes to the mainframe
◮ Data assimilation: combine data with a running model
◮ Cast as a Bayesian update or a least squares problem
◮ Used already in Apollo 11 navigation, now in GPS, standard in numerical weather prediction

3. Weak constraint 4DVAR
◮ We want to determine $x_0, \dots, x_k$ ($x_i$ = state at time $i$) approximately from the model and the observations (data):
  $x_0 \approx x_b$ (the background, an estimate of the state at time 0)
  $x_i \approx M_i(x_{i-1})$ (state evolution by the model)
  $H_i(x_i) \approx y_i$ (value of the observation operator $\approx$ the data)
◮ Quantify "$\approx$" by covariances: $x_0 \approx x_b \;\Leftrightarrow\; \|x_0 - x_b\|_{B^{-1}}^2 = (x_0 - x_b)^T B^{-1} (x_0 - x_b) \approx 0$, etc.
◮ $\Rightarrow$ nonlinear least squares problem
\[ \|x_0 - x_b\|_{B^{-1}}^2 + \sum_{i=1}^k \|x_i - M_i(x_{i-1})\|_{Q_i^{-1}}^2 + \sum_{i=1}^k \|y_i - H_i(x_i)\|_{R_i^{-1}}^2 \;\to\; \min_{x_{0:k}} \]
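A minimal sketch of evaluating this nonlinear cost function, assuming the model operators, observation operators, and inverse covariances are supplied by the caller; all names are illustrative and not part of the presentation.

```python
import numpy as np

def wc4dvar_cost(x, xb, y, M, H, B_inv, Q_inv, R_inv):
    """Evaluate the weak-constraint 4DVAR objective above.

    x      : list of states [x_0, ..., x_k]
    xb     : background estimate of x_0
    y      : observations, y[i] is the data at time i, i = 1..k
    M, H   : callables, M[i](x_{i-1}) and H[i](x_i), i = 1..k
    *_inv  : inverse covariances B^{-1}, Q_i^{-1}, R_i^{-1}
    """
    def sq(v, W):                                  # weighted squared norm ||v||^2_W
        return float(v @ (W @ v))

    J = sq(x[0] - xb, B_inv)                       # background term
    k = len(x) - 1
    for i in range(1, k + 1):
        J += sq(x[i] - M[i](x[i - 1]), Q_inv[i])   # model error term
        J += sq(y[i] - H[i](x[i]), R_inv[i])       # observation term
    return J
```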

4. Incremental 4DVAR
◮ Linearization at $[x_0, \dots, x_k] = x_{0:k}$:
\[ M_i(x_{i-1} + z_{i-1}) \approx M_i(x_{i-1}) + M_i'(x_{i-1})\, z_{i-1}, \qquad H_i(x_i + z_i) \approx H_i(x_i) + H_i'(x_i)\, z_i \]
◮ The Gauss-Newton method (Bell, 1994; Courtier et al., 1994) consists of the iterations $x_{0:k} \leftarrow x_{0:k} + z_{0:k}$, with the increments $z_{0:k}$ solving the linear least squares problem
\[ \|x_0 + z_0 - x_b\|_{B^{-1}}^2 + \sum_{i=1}^k \|x_i + z_i - M_i(x_{i-1}) - M_i'(x_{i-1})\, z_{i-1}\|_{Q_i^{-1}}^2 + \sum_{i=1}^k \|y_i - H_i(x_i) - H_i'(x_i)\, z_i\|_{R_i^{-1}}^2 \;\to\; \min_{z_{0:k}} \]
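A minimal sketch of the Gauss-Newton outer loop, with the inner linear least squares solve abstracted behind a `solve_increments` callback (in this talk, the EnKS of the following slides); the function and argument names are illustrative assumptions.

```python
def gauss_newton_4dvar(x, solve_increments, n_iter=5):
    """Outer Gauss-Newton iteration x_{0:k} <- x_{0:k} + z_{0:k}.

    x                : list of states [x_0, ..., x_k]
    solve_increments : callable returning the increments [z_0, ..., z_k]
                       for the linearized problem at the current x
    """
    for _ in range(n_iter):
        z = solve_increments(x)                        # inner linear solve
        x = [xi + zi for xi, zi in zip(x, z)]          # update all states
    return x
```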

5. Linearized 4DVAR as a Kalman smoother
◮ Write the linear least squares problem for the increments $z_{0:k}$ as
\[ \|z_0 - z_b\|_{B^{-1}}^2 + \sum_{i=1}^k \|z_i - M_i z_{i-1} - m_i\|_{Q_i^{-1}}^2 + \sum_{i=1}^k \|d_i - H_i z_i\|_{R_i^{-1}}^2 \;\to\; \min_{z_{0:k}} \]
◮ Theorem. The least squares solution $z_{0:k}$ is the mean of the random vectors $Z_{0:k}$ satisfying the stochastic equations
\[ Z_0 = z_b + V_0, \quad V_0 \sim N(0, B) \]
\[ Z_i = M_i Z_{i-1} + m_i + V_i, \quad V_i \sim N(0, Q_i), \quad i = 1, \dots, k \]
\[ d_i = H_i Z_i + W_i, \quad W_i \sim N(0, R_i), \quad i = 1, \dots, k \]
◮ This is the Kalman smoother (Rauch et al., 1965; Bell, 1994).
◮ Simulate by Monte Carlo $\Rightarrow$ Ensemble Kalman Smoother (EnKS) (Evensen and van Leeuwen, 2000).
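A minimal sketch of Monte Carlo sampling of the model part of these stochastic equations; the conditioning on the data $d_i$ is what the EnKF/EnKS on the following slides provide. All argument names are illustrative assumptions.

```python
import numpy as np

def sample_prior_trajectories(zb, sqrt_B, M, m, sqrt_Q, k, N, rng):
    """Draw N samples of (Z_0, ..., Z_k) before any data are used.

    zb     : background mean z_b, shape (n,)
    sqrt_B : square root of B; sqrt_Q[i] is a square root of Q_i
    M, m   : M[i] is the matrix M_i, m[i] the shift m_i, i = 1..k
    """
    n = zb.size
    Z = np.empty((N, k + 1, n))
    for l in range(N):
        Z[l, 0] = zb + sqrt_B @ rng.standard_normal(n)        # Z_0 = z_b + V_0
        for i in range(1, k + 1):
            Z[l, i] = (M[i] @ Z[l, i - 1] + m[i]
                       + sqrt_Q[i] @ rng.standard_normal(n))   # Z_i = M_i Z_{i-1} + m_i + V_i
    return Z
```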

6. Kalman smoother
◮ To solve the stochastic Kalman smoother equations (Rauch et al., 1965; Bell, 1994)
\[ Z_0 = z_b + V_0, \quad V_0 \sim N(0, B) \]
\[ Z_i = M_i Z_{i-1} + m_i + V_i, \quad V_i \sim N(0, Q_i), \quad i = 1, \dots, k \]
\[ d_i = H_i Z_i + W_i, \quad W_i \sim N(0, R_i), \quad i = 1, \dots, k \]
◮ Build the solution recursively: $Z_{i|j}$ incorporates the data $d_{1:j}$ only; we need $Z_i = Z_{i|k}$
◮ The Kalman filter is the recursion $Z_{i-1|i-1} \mapsto Z_{i|i}$
◮ The Kalman smoother applies the same update to previous times to get $Z_{0:i|i-1} \mapsto Z_{0:i|i}$
◮ Eventually, we obtain $Z_{0:k}$ as $Z_{0:k|k}$

7. Kalman filter and smoother recursions
◮ Recall the notation: $Z_{\,\text{step number} \,|\, \text{data incorporated up to}}$
◮ The Kalman filter incorporates the data $d_i$ in the latest state: $Z_{i|i-1} \xrightarrow{\text{data } d_i} Z_{i|i}$
\[ Z_{0|0} \xrightarrow{M_1} Z_{1|0} \xrightarrow{\text{data } d_1} Z_{1|1} \xrightarrow{M_2} Z_{2|1} \xrightarrow{\text{data } d_2} Z_{2|2} \]
◮ The Kalman smoother incorporates the data $d_i$ also in all previous states: $Z_{0:i|i-1} \xrightarrow{\text{data } d_i} Z_{0:i|i}$
\[ Z_{0|0} \xrightarrow{M_1} \begin{bmatrix} Z_{0|0} \\ Z_{1|0} \end{bmatrix} \xrightarrow{d_1} \begin{bmatrix} Z_{0|1} \\ Z_{1|1} \end{bmatrix} \xrightarrow{M_2} \begin{bmatrix} Z_{0|1} \\ Z_{1|1} \\ Z_{2|1} \end{bmatrix} \xrightarrow{d_2} \begin{bmatrix} Z_{0|2} \\ Z_{1|2} \\ Z_{2|2} \end{bmatrix} \]

8. Ensemble Kalman filter (EnKF)
Represent the random vector $Z_{i|j}$ by an ensemble of realizations $Z^N_{i|j} = \left[ z^1_{i|j}, \dots, z^N_{i|j} \right]$.
1. Initialize by sampling the background distribution: $z^\ell_{0|0} \sim N(z_b, B)$, $\ell = 1, \dots, N$.
2. For $i = 1, \dots, k$:
  2.1 Advance in time and randomize by sampling the model error:
\[ z^\ell_{i|i-1} = M_i z^\ell_{i-1|i-1} + m_i + v^\ell_i, \quad v^\ell_i \sim N(0, Q_i) \]
  2.2 Incorporate the data $d_i$:
\[ z^\ell_{i|i} = z^\ell_{i|i-1} - P^N_i H_i^T \left( H_i P^N_i H_i^T + R_i \right)^{-1} \left( H_i z^\ell_{i|i-1} - d_i - w^\ell_i \right), \quad w^\ell_i \sim N(0, R_i), \]
where $P^N_i$ is the sample covariance of the ensemble $Z^N_{i|i-1}$.
Approximate solution: the ensemble mean $z_{i|i} \approx \frac{1}{N}\left( z^1_{i|i} + \dots + z^N_{i|i} \right)$.
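A minimal sketch of the perturbed-observation analysis step 2.2, assuming the observation operator is available as a matrix H; the function name and argument layout are illustrative, not from the presentation.

```python
import numpy as np

def enkf_analysis(Z, H, R, d, rng):
    """Perturbed-observation EnKF analysis step.

    Z : (n, N) forecast ensemble, columns are members z^l_{i|i-1}
    H : (m, n) observation matrix, R : (m, m) observation error covariance
    d : (m,) observation vector; rng : numpy random Generator
    Returns the (n, N) analysis ensemble.
    """
    n, N = Z.shape
    Zm = Z.mean(axis=1, keepdims=True)
    A = (Z - Zm) / np.sqrt(N - 1)                        # ensemble anomalies
    P = A @ A.T                                          # sample covariance P^N_i
    K = P @ H.T @ np.linalg.inv(H @ P @ H.T + R)         # Kalman gain
    W = rng.multivariate_normal(np.zeros(len(d)), R, size=N).T  # obs perturbations w^l_i
    return Z - K @ (H @ Z - d[:, None] - W)              # update every member
```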

9. Ensemble Kalman smoother (EnKS)
◮ Run the ensemble Kalman filter (EnKF). It turns out that incorporating the data $d_i$ replaces the ensemble by linear combinations of its members (transformation by a matrix $T^N_i$):
\[ Z^N_{i|i} = Z^N_{i|i-1} T^N_i, \qquad T^N_i \in \mathbb{R}^{N \times N} \]
◮ EnKS = EnKF + transform the history in exactly the same way:
\[ Z^N_{0:i|i} = Z^N_{0:i|i-1} T^N_i \]
Approximate solution: the ensemble mean $z_{i|k} \approx \frac{1}{N}\left( z^1_{i|k} + \dots + z^N_{i|k} \right)$.
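A minimal sketch of the history transform. How the transform $T^N_i$ is extracted from the filter analysis is not spelled out on the slide, so the least squares recovery below is only an assumption for illustration.

```python
import numpy as np

def enks_update_history(history, T):
    """Apply the same column transform T to the stored ensembles.

    history : list of (n, N) ensembles Z^N_{0|i-1}, ..., Z^N_{i|i-1}
    T       : (N, N) transform with Z^N_{i|i} = Z^N_{i|i-1} T
    """
    return [Z @ T for Z in history]

def transform_from_analysis(Z_forecast, Z_analysis):
    """One possible way to recover T (an assumption, not from the slides):
       least squares solution of Z_forecast @ T = Z_analysis."""
    T, *_ = np.linalg.lstsq(Z_forecast, Z_analysis, rcond=None)
    return T
```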

10. Convergence of the EnKF and EnKS
Definition (stochastic $L^p$ norm). For a random vector $X : \Omega \to \mathbb{R}^n$,
\[ \|X\|_p = \left( E|X|^p \right)^{1/p} \]
Theorem (Mandel et al., 2011; Le Gland et al., 2011). $z^\ell_{i|i} \to Z_{i|i}$ in all $L^p$, $1 \le p < \infty$, as the ensemble size $N \to \infty$.
Corollary. The EnKF ensemble mean converges:
\[ \frac{1}{N}\left( z^1_{i|i} + \dots + z^N_{i|i} \right) \to z_{i|i} = E\left( Z_{i|i} \right) \quad \text{in all } L^p,\ 1 \le p < \infty,\ \text{as } N \to \infty \]
Corollary. The EnKS ensemble mean converges to the solution of the linear least squares problem:
\[ \frac{1}{N}\left( z^1_{i|k} + \dots + z^N_{i|k} \right) \to z_i = z_{i|k} = E\left( Z_{i|k} \right) \quad \text{in all } L^p,\ 1 \le p < \infty,\ \text{as } N \to \infty \]

11. Model operator
The linearized model $M_i = M_i'(x_{i-1})$ occurs only in advancing the time, as the action on the ensemble of increments:
\[ M_i\!\left( x_{i-1} + z^\ell_{i-1} \right) - x_i \approx M_i(x_{i-1}) + M_i'(x_{i-1})\, z^\ell_{i-1} - x_i = M_i z^\ell_{i-1} + m_i \]
Approximating by finite differences with a parameter $\tau > 0$:
\[ M_i z^\ell_{i-1} + m_i \approx \frac{M_i\!\left( x_{i-1} + \tau z^\ell_{i-1} \right) - M_i(x_{i-1})}{\tau} + M_i(x_{i-1}) - x_i \]
Needs $N + 1$ evaluations of $M_i$, at $x_{i-1}$ and at $x_{i-1} + \tau z^\ell_{i-1}$, $\ell = 1, \dots, N$.
Accurate in the limit $\tau \to 0$.
For $\tau = 1$, we recover the nonlinear model: $M_i\!\left( x_{i-1} + z^\ell_{i-1} \right) - x_i$.
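A minimal sketch of this finite-difference action on a single increment; `model` stands for the nonlinear operator $M_i$ and is an assumed callable.

```python
def model_action_fd(model, x_prev, x_i, z, tau=0.1):
    """Approximate M_i z + m_i = M'_i(x_{i-1}) z + M_i(x_{i-1}) - x_i
       by a finite difference, using one extra nonlinear model run."""
    Mx = model(x_prev)                                    # M_i(x_{i-1})
    return (model(x_prev + tau * z) - Mx) / tau + Mx - x_i
```

The N + 1 nonlinear model runs are independent across ensemble members, which is what makes the approach parallel and free of tangent and adjoint codes, as in the title of the talk.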

12. Observation operator
The observation matrix occurs only in the action on the ensemble, $H_i Z^N = \left[ H_i z^1, \dots, H_i z^N \right]$.
Approximating by finite differences with a parameter $\tau > 0$:
\[ H_i z^\ell_i \approx \frac{H_i\!\left( x_i + \tau z^\ell_i \right) - H_i(x_i)}{\tau} \]
Needs $N + 1$ evaluations of $H_i$, at $x_i$ and at $x_i + \tau z^\ell_i$, $\ell = 1, \dots, N$.
Accurate in the limit $\tau \to 0$.
For $\tau = 1$, this becomes exactly the EnKS run on the nonlinear problem $\Rightarrow$ the EnKS is independent of the point of linearization $\Rightarrow$ no convergence of the 4DVAR iterations. In our tests, $\tau = 0.1$ seems to work well enough.
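The same finite-difference idea for the observation operator, as a minimal sketch; `obs` stands for the nonlinear $H_i$ and is an assumed callable.

```python
def obs_action_fd(obs, x_i, z, tau=0.1):
    """Approximate the action H_i z = H'_i(x_i) z by a finite difference."""
    return (obs(x_i + tau * z) - obs(x_i)) / tau
```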

13. Tikhonov regularization and the Levenberg-Marquardt method
◮ Gauss-Newton may not converge, even locally. Add a penalty (Tikhonov regularization) to control the size of the increments $z_i$:
\[ \|z_0 - z_b\|_{B^{-1}}^2 + \sum_{i=1}^k \|z_i - M_i z_{i-1} - m_i\|_{Q_i^{-1}}^2 + \sum_{i=1}^k \|d_i - H_i z_i\|_{R_i^{-1}}^2 + \gamma \sum_{i=0}^k \|z_i\|_{S_i^{-1}}^2 \;\to\; \min_{z_{0:k}} \]
◮ This becomes the Levenberg-Marquardt method, which is guaranteed to converge for large enough $\gamma$.
◮ Implement the regularization as independent observations $z_i \approx 0$ with error covariance $S_i$: simply run the analysis step a second time (Johns and Mandel, 2008). Here this is statistically exact, because the distributions in the Kalman smoother are Gaussian.
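To make the second-analysis idea concrete, here is a minimal sketch. It assumes the `enkf_analysis` function from the sketch on slide 8 is available, and that the weight $\gamma$ is absorbed into the covariance of the artificial observation; the function name and argument layout are illustrative.

```python
import numpy as np

def regularization_analysis(Z, S, rng):
    """Run the analysis step a second time with the artificial observation
       z_i ~ 0, identity observation operator, and error covariance S
       (the penalty weight gamma can be absorbed into S).
       Reuses enkf_analysis from the EnKF sketch on slide 8 (an assumption).

    Z : (n, N) ensemble after the regular analysis at time i
    S : (n, n) error covariance of the artificial observation z_i ~ 0
    """
    n = Z.shape[0]
    return enkf_analysis(Z, np.eye(n), S, np.zeros(n), rng)
```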

14. Convergence
Theorem. Every iteration of the Levenberg-Marquardt or Gauss-Newton method with the stochastic linear solver by the EnKS converges, in all $L^p$, $1 \le p < \infty$, as $N \to \infty$ and $\tau \to 0$, to the same iteration with the exact linear solver.

15. Functional interface
No matrices are needed explicitly, only certain matrix-vector operations:
◮ Evaluation of the model operator $M_i(x_{i-1})$ for a given state $x_{i-1}$.
◮ Evaluation of the observation operator $H_i(x_i)$ for a given state $x_i$.
◮ Multiplication of a vector $u_0$ by a square root $V$ of the background covariance $B$, where $B = VV^T$. Similarly, multiplication of a vector $u_i$ by a square root of the model error covariance $Q_i$, and of a data-dimensioned vector $v_i$ by a square root of the data error covariance $R_i$. These matrix square roots are used only to generate samples from $N(0, B)$, $N(0, Q_i)$, and $N(0, R_i)$.
◮ Solving a system with the matrix $R_i$.
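A minimal sketch of what such a functional interface could look like in Python; the class and method names are illustrative assumptions, not part of the presentation.

```python
from typing import Protocol
import numpy as np

class WeakConstraint4DVarProblem(Protocol):
    """Operations the solver needs; no matrices are stored explicitly."""

    def model(self, i: int, x_prev: np.ndarray) -> np.ndarray:
        """Evaluate the model operator M_i(x_{i-1})."""

    def obs(self, i: int, x: np.ndarray) -> np.ndarray:
        """Evaluate the observation operator H_i(x_i)."""

    def sqrt_B(self, u: np.ndarray) -> np.ndarray:
        """Apply a square root V of the background covariance, B = V V^T."""

    def sqrt_Q(self, i: int, u: np.ndarray) -> np.ndarray:
        """Apply a square root of the model error covariance Q_i."""

    def sqrt_R(self, i: int, v: np.ndarray) -> np.ndarray:
        """Apply a square root of the data error covariance R_i."""

    def solve_R(self, i: int, v: np.ndarray) -> np.ndarray:
        """Solve a system with the matrix R_i."""
```

A sample from $N(0, B)$ is then generated by applying `sqrt_B` to a standard normal vector, and similarly for $Q_i$ and $R_i$.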

16. Computational results
Lorenz 63 model:
\[ \frac{dx}{dt} = -\sigma (x - y), \qquad \frac{dy}{dt} = \rho x - y - xz, \qquad \frac{dz}{dt} = xy - \beta z \]
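A minimal sketch of the Lorenz 63 right-hand side; the classical parameter values sigma = 10, rho = 28, beta = 8/3 are an assumption, since the presentation does not state which values were used.

```python
import numpy as np

def lorenz63_rhs(state, sigma=10.0, rho=28.0, beta=8.0 / 3.0):
    """Right-hand side of the Lorenz 63 system."""
    x, y, z = state
    return np.array([-sigma * (x - y),       # dx/dt = -sigma (x - y)
                     rho * x - y - x * z,    # dy/dt = rho x - y - x z
                     x * y - beta * z])      # dz/dt = x y - beta z
```

A trajectory can then be produced with any standard ODE integrator, for example a Runge-Kutta step applied repeatedly to this right-hand side.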

17. EnKS-4DVAR for the Lorenz 63 model
Twin experiment: one solution chosen as the truth.
\[ B = \sigma_b^2 \,\mathrm{diag}\!\left( 1, \tfrac{1}{4}, \tfrac{1}{9} \right), \qquad R_i = \sigma_r^2 I, \qquad H_i(x, y, z) = \left( x^2, y^2, z^2 \right) \]
Root mean square error of the EnKS-4DVAR iterations over 50 timesteps, ensemble size 100, model error covariances $Q_i \approx 0$:

Iteration    1       2       3      4      5      6
RMSE       20.16   15.37   3.73   2.53   0.09   0.09
