  1. Data-driven stochastic model reduction. Fei Lu, joint with Alexandre J. Chorin and Kevin Lin. UC Berkeley, Lawrence Berkeley National Laboratory, U of Arizona. July 18, 2016, ICERM, Brown University. 1 / 22

  2. Outline. Part I: Motivation for stochastic model reduction. Part II: A discrete-time approach. Part III: Examples. 2 / 22

  3. Motivation. Climate modeling and numerical weather prediction: a high-dimensional full system, discrete partial data, and a prediction task.
  Full system:
    d/dt x = f(x) + U(x, y),
    d/dt y = g(x, y).
  Observe only {x(nδ)}_{n=1}^N; forecast x(t) for t ≥ Nδ.
  Possible approaches:
  - estimate y and solve the full system (but the full system is expensive to solve, and may be unknown);
  - construct a reduced model for x. 3 / 22

  4. Why stochastic models. The data {x(nδ)} reveal properties of the full system:
  - long-term statistics (ergodicity): (1/N) Σ_{i=1}^N F(x(n_i δ)) → E[F(x)];
  - positive Lyapunov exponents;
  - the effects of y act like random forces.
  Goal: develop a stochastic reduced system for x that can reproduce the long-term statistics and make medium-range predictions. 4 / 22

  5. Outline. Part I: Motivation for stochastic model reduction. Part II: A discrete-time approach (previous work: the continuous-time approach; discrete-time stochastic parametrization). Part III: Examples. 5 / 22

  6. The Mori-Zwanzig formalism (Mori (65), Zwanzig (73)). Rewrite the deterministic system in a form which resembles a generalized Langevin equation (Chorin-Hald (06)): the equation, a projection operator, and Duhamel's principle give
    dX_t/dt = f(X_t) + ∫_0^t Γ(t - s, X_s) ds + W_t,
  with a Markovian term, a memory term, and a noise term (Chorin et al (00), Zwanzig (01), Chorin-Hald (06)).
  Message: memory exists in a closed system for x.
  Related work: multi-level regression in the atmospheric sciences (Kondrashov et al (05-15), Wilks (05)); hypoelliptic SDEs (Majda-Harlim (13, 14)):
    dX_t = f(X_t) dt + V_t dt,
    dV_t = a(X_t, V_t) dt + σ dW_t. 6 / 22

  7. Continuous-time approach. From discrete partial data, posit the structure of a continuous-time stochastic system, e.g.
    dX_t = f(X_t) dt + V_t dt,
    dV_t = a(X_t, V_t) dt + σ dW_t,
  estimate its parameters, and then discretize it to obtain a discrete-time stochastic system for prediction. 7 / 22

  8. Discrete-time approach. Instead of the continuous-time route (structure, parameter estimation, then discretization), work with the discrete partial data directly: derive the structure of a discrete-time stochastic system
    X_n = Φ(X_{n-p:n-1}, ξ_{n-q:n-1}) + ξ_n,
  estimate its parameters, and use it directly for prediction. 8 / 22

  9. Discrete-time stochastic parametrization. Identify a discrete-time stochastic system, NARMA (Chorin-Lu (15)):
    X_n = X_{n-1} + δ R_δ(X_{n-1}) + δ Z_n,  Z_n = Φ_n + ξ_n,
    Φ_n = Σ_{j=1}^p a_j Z_{n-j} + Σ_{j=1}^r Σ_{i=1}^s b_{i,j} P_i(X_{n-j}) + Σ_{j=1}^q c_j ξ_{n-j};
  - R_δ(X_{n-1}) comes from a discrete scheme for x' ≈ f(x);
  - Φ_n = Φ(Z_{n-p:n-1}, X_{n-r:n-1}, ξ_{n-q:n-1}) depends on the past;
  - the ξ_n are i.i.d. N(0, σ²);
  - this is a Nonlinear AutoRegression Moving Average (NARMA) model.
  Structure: the terms in Φ_n. Parameters: a_j, b_{i,j}, c_j, and σ. 9 / 22
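To make the recursion concrete, here is a minimal simulation sketch for a scalar state. The function name `simulate_narma`, the coefficient values, and the basis functions are illustrative choices, not those of Chorin-Lu (15).

```python
import numpy as np

def simulate_narma(x0, n_steps, delta, R, a, b, c, sigma, basis, seed=0):
    """Simulate a scalar NARMA reduced model (a sketch of the form above):
        X_n = X_{n-1} + delta * R(X_{n-1}) + delta * Z_n,
        Z_n = Phi_n + xi_n,   xi_n i.i.d. N(0, sigma^2),
        Phi_n = sum_j a_j Z_{n-j} + sum_{j,i} b_{i,j} P_i(X_{n-j})
                + sum_j c_j xi_{n-j}.
    a: length p; c: length q; b: shape (s, r), with basis P_1..P_s and
    r lags of X."""
    rng = np.random.default_rng(seed)
    p, q = len(a), len(c)
    s, r = b.shape
    lag = max(p, q, r, 1)           # warm-up length covering all lags
    X = np.full(n_steps + lag, float(x0))
    Z = np.zeros(n_steps + lag)
    xi = np.zeros(n_steps + lag)
    for n in range(lag, n_steps + lag):
        phi = sum(a[j] * Z[n - 1 - j] for j in range(p))
        phi += sum(b[i, j] * basis[i](X[n - 1 - j])
                   for i in range(s) for j in range(r))
        phi += sum(c[j] * xi[n - 1 - j] for j in range(q))
        xi[n] = sigma * rng.standard_normal()
        Z[n] = phi + xi[n]
        X[n] = X[n - 1] + delta * R(X[n - 1]) + delta * Z[n]
    return X[lag:], Z[lag:]
```

For instance, with R(x) = -x, one autoregressive lag (p = 1), one basis function P_1(x) = x, and no moving-average part, the loop reproduces the displayed recursion step by step.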

  10. Parameter estimation: the conditional maximum likelihood estimator (MLE). For the model
    X_n = X_{n-1} + δ R_δ(X_{n-1}) + δ Z_n,  Z_n = Φ_n + ξ_n,
    Φ_n = Σ_{j=1}^p a_j Z_{n-j} + Σ_{j=1}^r Σ_{i=1}^s b_{i,j} P_i(X_{n-j}) + Σ_{j=1}^q c_j ξ_{n-j},
  conditional on ξ_1, ..., ξ_q, the log-likelihood of X_{q+1:N} is
    l(θ | ξ_1, ..., ξ_q) = - Σ_{n=q+1}^N |Z_n - Φ_n|² / (2σ²) - ((N - q)/2) ln σ².
  - when q = 0, the MLE is the least-squares estimator (LSE);
  - when q > 0, set ξ_1 = ... = ξ_q = 0; ergodicity ⇒ the MLE is consistent ⇒ independent of ξ_1, ..., ξ_q.
  Structure derivation/selection: stability and long-term statistics; numerical schemes; analytical properties of the full system. 10 / 22
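In the q = 0 case the conditional MLE reduces to ordinary least squares, which can be sketched as follows; the helper name `narma_lse` and its restriction to a single lag of X are illustrative assumptions.

```python
import numpy as np

def narma_lse(X, delta, R, p, basis):
    """Least-squares fit of NARMA coefficients in the q = 0 case, where the
    conditional MLE reduces to the LSE (sketch: p lags of Z, one lag of X).
    First recover the noise-process samples Z_n = (X_n - X_{n-1})/delta
    - R(X_{n-1}) from the data, then regress Z_n on its own lags and on
    basis functions of the previous state."""
    Z = (X[1:] - X[:-1]) / delta - R(X[:-1])   # implied Z samples
    rows, targets = [], []
    for n in range(p, len(Z)):
        feats = [Z[n - j] for j in range(1, p + 1)]  # Z_{n-1}, ..., Z_{n-p}
        feats += [P(X[n]) for P in basis]            # P_i of the previous state
        rows.append(feats)
        targets.append(Z[n])
    A, y = np.array(rows), np.array(targets)
    theta, *_ = np.linalg.lstsq(A, y, rcond=None)
    sigma_hat = np.sqrt(np.mean((y - A @ theta) ** 2))
    return theta, sigma_hat
```

On synthetic data generated from a model of this form, the estimator recovers the autoregressive coefficient and the noise level.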

  11. Outline. Part I: Motivation for stochastic model reduction. Part II: A discrete-time approach. Part III: Examples: the Lorenz 96 system; the Kuramoto-Sivashinsky equation. 11 / 22

  12. Example I: the Lorenz 96 system
    d/dt x_k = x_{k-1}(x_{k+1} - x_{k-2}) - x_k + 10 - (1/J) Σ_j y_{j,k},
    d/dt y_{j,k} = (1/ε) [ y_{j+1,k}(y_{j-1,k} - y_{j+2,k}) - y_{j,k} + x_k ],
  where x ∈ R^K, y ∈ R^{JK}; a chaotic system. With K = 18, J = 20: y ∈ R^360.
  Find a reduced system for x ∈ R^18 based on
  - the data {x(nδ)}_{n=1}^N;
  - the truncation d/dt x_k ≈ x_{k-1}(x_{k+1} - x_{k-2}) - x_k + 10 (Wilks, 2005). 12 / 22
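A vectorized right-hand side for this two-scale system can be sketched as follows. The slide does not specify the boundary conventions, so this sketch assumes periodicity in k and, within each sector k, periodicity in j (implementations vary on how the fast modes wrap across sectors).

```python
import numpy as np

def l96_rhs(x, y, eps, F=10.0):
    """Right-hand side of the two-scale Lorenz 96 system on the slide:
        dx_k/dt = x_{k-1}(x_{k+1} - x_{k-2}) - x_k + F - (1/J) sum_j y_{j,k},
        dy_{j,k}/dt = (1/eps)[y_{j+1,k}(y_{j-1,k} - y_{j+2,k}) - y_{j,k} + x_k].
    x: shape (K,); y: shape (J, K). np.roll(x, 1)[k] = x[k-1], etc."""
    dx = (np.roll(x, 1) * (np.roll(x, -1) - np.roll(x, 2))
          - x + F - y.mean(axis=0))
    dy = (np.roll(y, -1, axis=0) * (np.roll(y, 1, axis=0)
          - np.roll(y, -2, axis=0)) - y + x) / eps
    return dx, dy
```

A quick sanity check: for constant x = c and y = 0, the slow tendency reduces to F - c and the fast tendency to c/ε.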

  13. Two reduced models for the Lorenz 96 system.
  NARMA (Chorin-Lu (15)): x_n = x_{n-1} + δ R_δ(x_{n-1}) + δ z_n; z_n = Φ_n + ξ_n,
    Φ_n = μ + Σ_{j=1}^p a_j z_{n-j} + Σ_{j=1}^r Σ_{l=1}^{d_x} b_{j,l} (x_{n-j})^l + Σ_{j=1}^s Σ_{l=1}^{d_R} c_{j,l} (R_δ(x_{n-j}))^l + Σ_{j=1}^q d_j ξ_{n-j}.
  Polynomial autoregression (POLYAR) (Wilks (05)):
    d/dt x_k = x_{k-1}(x_{k+1} - x_{k-2}) - x_k + 10 + U,  U = P(x_k) + η_k,
  with η_k(t + δ) = φ η_k(t) + ξ(t), where P(x) = Σ_{j=0}^d a_j x^j and ξ(t) ~ N(0, σ²). 13 / 22

  14. Long-term statistics. Run the reduced systems for a long time and compare with the data. [Figures: empirical probability density function of x_1 and autocorrelation function, for L96, POLYAR, and NARMA.] 14 / 22

  15. Short-term prediction. [Figures: a typical ensemble forecast of X_1 (forecast trajectories in cyan, true trajectory in blue) and the RMSE of many forecasts against lead time, for POLYAR and NARMA.] Forecast time: POLYAR t ≈ 1; NARMA t ≈ 2.5. 15 / 22

  16. Example II: the Kuramoto-Sivashinsky equation. The KSE with periodic BC v(x, t) = v(x + 2πν, t):
    ∂v/∂t + ∂²v/∂x² + ∂⁴v/∂x⁴ + v ∂v/∂x = 0,  x ∈ R, t > 0;
  it is spatio-temporally chaotic. In Fourier space, v(x, t) = Σ_{k=-∞}^{+∞} v̂_k e^{i q_k x} (here q_k = k/ν), and
    d/dt v̂_k = (q_k² - q_k⁴) v̂_k - (i q_k / 2) Σ_{l=-∞}^{∞} v̂_l v̂_{k-l}.
  Problem setting: ν = 3.43; observe only K = 5 modes (v̂_k, k = 1, ..., K); predict their evolution. 16 / 22

  17. The truncated system is not accurate:
    d/dt v̂_k = (q_k² - q_k⁴) v̂_k - (i q_k / 2) Σ_{|l| ≤ K, |k-l| ≤ K} v̂_l v̂_{k-l},  |k| ≤ K.
  NARMA: X_n = X_{n-1} + δ R_δ(X_{n-1}) + δ Z_n, Z_n = Φ_n + ξ_n, with R_δ(X_{n-1}) computed from the K-mode truncated system.
  Structure for Φ_n:
    Φ_n = Σ_{j=1}^p a_j Z_{n-j} + Σ_{j=1}^q c_j ξ_{n-j} + Σ_{j=1}^r Σ_{i=1}^s b_{i,j} P_i(X_{n-j});
  - different modes have different dynamics;
  - there is nonlinear interaction between modes. 17 / 22
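The truncated right-hand side above translates directly into code. This sketch stores the modes k = -K..K in one complex array; the indexing convention (index k + K) is an implementation choice, not part of the slide.

```python
import numpy as np

def kse_truncated_rhs(vhat, nu):
    """RHS of the K-mode Galerkin truncation of the KS equation in Fourier
    space (a direct transcription of the formula above):
        d vhat_k/dt = (q_k^2 - q_k^4) vhat_k
                      - (i q_k / 2) sum_{|l|<=K, |k-l|<=K} vhat_l vhat_{k-l},
    with q_k = k / nu. vhat holds modes k = -K..K at index k + K."""
    K = (len(vhat) - 1) // 2
    q = np.arange(-K, K + 1) / nu
    conv = np.zeros_like(vhat)
    for k in range(-K, K + 1):
        # truncated convolution: both factors stay within the resolved modes
        for l in range(max(-K, k - K), min(K, k + K) + 1):
            conv[k + K] += vhat[l + K] * vhat[k - l + K]
    return (q**2 - q**4) * vhat - 0.5j * q * conv
```

For real v, the modes satisfy v̂_{-k} = conj(v̂_k), and the truncated right-hand side preserves this reality constraint.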

  18. Large-time behavior of the KSE (Constantin, Jolly, Kevrekidis, Titi et al (88-94)). Let v = u + w and rewrite the KSE as
    du/dt = Au + P f(u + w),
    dw/dt = Aw + Q f(u + w).
  Inertial manifold M: if M is the graph of a function w = ψ(u), then
    du/dt = Au + P f(u + ψ(u)).
  Approximate inertial manifolds (AIMs) approximate the function w = ψ(u):
    dw/dt ≈ 0 ⇒ w ≈ A^{-1} Q f(u + w);
  fixed-point iteration: ψ_0 = 0; ψ_{n+1} = A^{-1} Q f(u + ψ_n).
  An accurate AIM requires m = dim(u) to be large (the distance to the attractor scales as |q_m|^{-γ}). 18 / 22
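The fixed-point iteration ψ_{n+1} = A^{-1} Q f(u + ψ_n) can be illustrated in a toy setting; the scalar operator A w = -λw and the slaving function `Qf` below are hypothetical stand-ins for the KSE operators, chosen only to show the contraction.

```python
import numpy as np

def aim_psi(u, lam, Qf, n_iter=20):
    """Fixed-point construction of an approximate inertial manifold:
        psi_0 = 0,  psi_{k+1} = A^{-1} Q f(u + psi_k),
    in a toy setting where A acts on the high modes as w -> -lam * w, with
    the sign absorbed so that the slaved modes solve w = Qf(u, w) / lam."""
    psi = np.zeros_like(np.asarray(u, dtype=float))
    for _ in range(n_iter):
        psi = Qf(u, psi) / lam
    return psi
```

With Qf(u, w) = u² + 0.5 w and λ = 2, the map is a contraction (Lipschitz constant 0.25) and converges to the slaving relation ψ(u) = (2/3) u².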

  19. NARMA with AIMs (Lu-Lin-Chorin (15)). The AIMs hint at how the high modes depend on the low modes: for |k| > K,
    v̂_k ≈ ψ_{1,k} = (A^{-1} Q f(u))_k ⇒ v̂_k ≈ c_k Σ_{1 ≤ |l|, |k-l| ≤ K} v̂_l v̂_{k-l}.
  Important: keep the terms and estimate the coefficients; the regression terms are
    u_j^n for 1 ≤ j ≤ K, and Σ_{l=j-K}^K u_l^n u_{j-l}^n for K < j ≤ 2K.
  A discrete-time stochastic system:
    u_k^n = u_k^{n-1} + δ R_δ^k(u^{n-1}) + δ z_k^n,  z_k^n = Φ_k^n + ξ_k^n,
    Φ_k^n = μ_k + Σ_{j=1}^p a_{k,j} z_k^{n-j} + Σ_{j=0}^r b_{k,j} u_k^{n-j} + Σ_{j=1}^K c_{k,j} u_{j+K}^n u_{j+K-k}^n + c_{k,K+1} R_δ^k(u^n) + Σ_{j=1}^q d_{k,j} ξ_k^{n-j}. 19 / 22

  20. Long-term statistics. [Figures: probability density function of v_4 (log scale) and auto-correlation function, for the data, the truncated system, and NARMA.] 20 / 22

  21. Short-term prediction. [Figures: a typical forecast of v_4 (ensemble sizes 1 and 20) and the RMSE of many forecasts against lead time, for the truncated system and NARMA.] Forecast time: the truncated system t ≈ 15; the NARMA system t ≈ 50. 21 / 22

  22. Summary. Data-driven stochastic model reduction: from discrete partial data of a high-dimensional full system, build a discrete-time stochastic reduced system for prediction:
    X_n = Φ(X_{n-p:n-1}, ξ_{n-q:n-1}) + Ψ(X_{n-p:n-1}, ξ_{n-q:n-1}) ξ_n.
  - integrate the numerical scheme into statistical inference;
  - easy parameter estimation;
  - flexible structure selection.
  Thanks for your attention! 22 / 22
