Stochastic model reduction: from nonlinear Galerkin to parametric - - PowerPoint PPT Presentation



slide-1
SLIDE 1

Stochastic model reduction: from nonlinear Galerkin to parametric inference

Fei Lu

Department of Mathematics, Johns Hopkins Joint work with: Alexandre J. Chorin (UC Berkeley) Kevin K. Lin (U. of Arizona)

May 22, 2019 SIAM DS19, Snowbird

1 / 24

slide-2
SLIDE 2

Consider dissipative PDEs in operator form:

  v_t = Av + B(v) + f,

where A is linear, self-adjoint (dissipative) and B(v) is nonlinear.

Examples:
  Burgers: v_t = ν v_xx − v v_x + f(x, t)
  Kuramoto–Sivashinsky: v_t = −v_xx − ν v_xxxx − v v_x

2 / 24

slide-3
SLIDE 3

Resolving the equation by Fourier–Galerkin (with periodic BC):

  d/dt v̂_k = −q_k^ν v̂_k − (ik/2) Σ_{|l|≤N, |k−l|≤N} v̂_l v̂_{k−l} + f̂_k(t)

Need: N ∼ 5/ν Fourier modes, dt ∼ 1/N.
E.g. ν = 10⁻⁴: spatial grid = 5 × 10⁴, time steps = 5T × 10⁴.

We are mainly interested in the large scales, K ≪ N.
Question: a reduced model for (v̂_1:K)?
  Reduce the spatial dimension + increase the time step-size.
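As a concrete illustration, the truncated convolution sum above can be evaluated directly. The following is a minimal Python sketch of our own (not the authors' code) of the Galerkin right-hand side for the KSE; the sign convention assumes v(x, t) = Σ v̂_k e^{ikx}:

```python
import numpy as np

def galerkin_rhs(v_hat, N, nu):
    """Right-hand side of the Fourier-Galerkin truncation of the KSE
    v_t = -v_xx - nu*v_xxxx - v*v_x, for modes k = -N..N.
    v_hat[i] holds the coefficient of mode k = i - N.
    """
    k = np.arange(-N, N + 1)
    # linear (dissipative) part: -v_xx -> +k^2, -nu*v_xxxx -> -nu*k^4
    lin = (k**2 - nu * k**4) * v_hat
    # nonlinear part: -(v v_x)_k = -(ik/2) * sum_{|l|<=N, |k-l|<=N} v_l v_{k-l}
    conv = np.convolve(v_hat, v_hat)          # full convolution, modes -2N..2N
    nonlin = -0.5j * k * conv[N:3 * N + 1]    # keep modes -N..N
    return lin + nonlin
```

If v̂ satisfies the reality condition v̂_{−k} = conj(v̂_k), this right-hand side preserves it, so time-stepping keeps v real.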

3 / 24

slide-4
SLIDE 4

Motivation: data assimilation in weather/climate prediction

[Diagram: discrete partial data from a high-dimensional full system → prediction]

  x′ = f(x) + U(x, y),   y′ = g(x, y).

Observe only {x(nh)}_{n=1}^N.  Forecast x(t), t ≥ Nh.

High-dimensional multiscale full systems, chaotic/ergodic:
  ◮ can only afford to resolve x′ = f(x) online
  ◮ y: unresolved variables (subgrid scales)

Discrete noisy observations: missing initial conditions.
Ensemble prediction: need many simulations.

4 / 24

slide-5
SLIDE 5

  x′ = f(x) + U(x, y),   y′ = g(x, y).    Data: {x(nh)}_{n=1}^N

Objective: develop a closed reduced model of x that
  captures key statistical + dynamical properties, and
  can be used for online state estimation and prediction.

[Approximate the stochastic process (x(t), t > 0) in distribution.]

5 / 24

slide-6
SLIDE 6

Various efforts in closure model reduction:

Direct constructions:
  ◮ nonlinear/Petrov–Galerkin: y(t) = F(x(t))
  ◮ Mori–Zwanzig formalism (memory)
      → statistical approximation by a non-Markov process
  ◮ relaxation approximations
  ◮ linear response / filtering / feedback control
  ◮ ...

Inference / data-driven ROM:
  ◮ hypoelliptic SDEs, GLEs and SDDEs
  ◮ discrete-time (time series) models
  ◮ data-driven: POD, DMD, Koopman operator
  ◮ nonparametric inference
  ◮ machine learning (NNs) ...

6 / 24

slide-7
SLIDE 7

Inference-based model reduction SDEs or time series – dynamical models

7 / 24

slide-8
SLIDE 8

Differential system or discrete-time system?

  Differential:   X′ = f(X) + Z(t, ω)
      — informative; inference via likelihood¹; discretization² introduces error
  Discrete-time:  X_{n+1} = X_n + R_h(X_n) + Z_n
      — non-intrusive; error correction by data

¹ Brockwell, Sørensen, Pokern, Wiberg, Samson, ...
² Milstein, Tretyakov, Talay, Mattingly, Stuart, Higham, ...

8 / 24
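The trade-off above can be seen on a toy example of our own (an Ornstein–Uhlenbeck process, not from the slides): fitting the discrete-time map directly from data at spacing h recovers the exact one-step coefficient, whereas a naive Euler discretization of the SDE carries an O(h²) error.

```python
import numpy as np

# For dX = -theta*X dt + sigma dW, the exact one-step map at spacing h is
# X_{n+1} = a*X_n + noise with a = exp(-theta*h); the Euler coefficient is
# 1 - theta*h.  Least squares on data recovers a, correcting that error.
rng = np.random.default_rng(0)
theta, sigma, h, n = 2.0, 0.5, 0.1, 200_000

a_true = np.exp(-theta * h)                       # exact map coefficient
s = sigma * np.sqrt((1 - a_true**2) / (2 * theta))  # exact one-step noise std
X = np.zeros(n)
for i in range(1, n):
    X[i] = a_true * X[i - 1] + s * rng.standard_normal()

# least-squares fit of X_{n+1} = a_hat * X_n (conditional MLE for this model)
a_hat = (X[1:] @ X[:-1]) / (X[:-1] @ X[:-1])
print(a_hat, a_true, 1 - theta * h)   # a_hat ~ a_true = 0.8187..., Euler gives 0.8
```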

slide-9
SLIDE 9

Discrete-time stochastic parametrization

NARMA(p, q) [Chorin–Lu (15)]:

  X_n = X_{n−1} + R_h(X_{n−1}) + Z_n,    Z_n = Φ_n + ξ_n,

  Φ_n = Σ_{j=1}^p a_j X_{n−j} + Σ_{j=1}^r Σ_{i=1}^s b_{i,j} P_i(X_{n−j})   (auto-regression)
      + Σ_{j=1}^q c_j ξ_{n−j}                                              (moving average)

R_h(X_{n−1}) comes from a numerical scheme for x′ ≈ f(x); Φ_n depends on the past.
Cf. NARMAX in system identification: Z_n = Φ(Z, X) + ξ_n.

Tasks:
  Structure derivation: terms and orders (p, r, s, q) in Φ_n;
  Parameter estimation: a_j, b_{i,j}, c_j, and σ — conditional MLE.
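To make the recursion concrete, here is a minimal sketch of our own of simulating a scalar NARMA model; for brevity the nonlinear basis is fixed to the single function P₁(x) = x² (the function and parameter names are ours, not the authors'):

```python
import numpy as np

def simulate_narma(n_steps, Rh, a, b, c, sigma, rng):
    """Simulate a scalar NARMA model (illustrative sketch):
        X_n = X_{n-1} + Rh(X_{n-1}) + Z_n,   Z_n = Phi_n + xi_n,
        Phi_n = sum_j a[j]*X_{n-1-j} + sum_j b[j]*X_{n-1-j}**2  (auto-regression)
              + sum_j c[j]*xi_{n-1-j}                           (moving average)
    with xi_n ~ N(0, sigma^2) i.i.d.; P_1(x) = x**2 stands in for the P_i.
    """
    p, q = len(a), len(c)
    m = max(p, q, 1)                      # slots for the lagged terms
    X = np.zeros(n_steps + m)
    xi = np.zeros(n_steps + m)
    for n in range(m, n_steps + m):
        xi[n] = sigma * rng.standard_normal()
        phi = sum(a[j] * X[n-1-j] + b[j] * X[n-1-j]**2 for j in range(p))
        phi += sum(c[j] * xi[n-1-j] for j in range(q))
        X[n] = X[n-1] + Rh(X[n-1]) + phi + xi[n]
    return X[m:]
```

Given data, the structure (p, r, s, q) would be chosen first and the coefficients then estimated by conditional MLE, as the slide describes.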

9 / 24

slide-10
SLIDE 10

Model reduction for dissipative PDEs by parametric inference

10 / 24

slide-11
SLIDE 11

Kuramoto–Sivashinsky: v_t = −v_xx − ν v_xxxx − v v_x
Burgers: v_t = ν v_xx − v v_x + f(x, t)

Goal: a closed model for (v̂_1:K), K = 2K₀ ≪ N.

  d/dt v̂_k = −q_k^ν v̂_k − (ik/2) Σ_{|l|≤K, |k−l|≤K} v̂_l v̂_{k−l} + f̂_k(t)
             − (ik/2) Σ_{|l|>K or |k−l|>K} v̂_l v̂_{k−l}

View (v̂_1:K) ∼ x and (v̂_k, k > K) ∼ y:

  x′ = f(x) + U(x, y),   y′ = g(x, y).

Task: represent the effect of the high modes on the low modes.

11 / 24

slide-12
SLIDE 12

Derivation of a parametric form (KSE)

Let v = u + w. In operator form, v_t = Av + B(v):

  du/dt = PAu + PB(u) + [PB(u + w) − PB(u)]
  dw/dt = QAw + QB(u + w)

Nonlinear Galerkin: approximate inertial manifold (IM)¹

  dw/dt ≈ 0  ⇒  w ≈ −(QA)⁻¹ QB(u + w)  ⇒  w ≈ ψ(u)

Needs: a spectral gap condition; dim(u) > K: parametrization with time delay (Lu–Lin 17).

¹ Foias, Constantin, Temam, Sell, Jolly, Kevrekidis, Titi et al. (88–94)

12 / 24
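The map w ≈ ψ(u) is typically computed by iterating the fixed-point relation w ← −(QA)⁻¹QB(u + w). A toy sketch of our own in Fourier space, using the KSE symbol for A and B(v) = −v v_x (convergence relies on the strong damping of the high modes, i.e. the spectral gap the slide mentions; all names are ours):

```python
import numpy as np

def conv_mode(v_hat, N):
    """(v*v)_k for k = -N..N, with v_hat indexed by k+N."""
    return np.convolve(v_hat, v_hat)[N:3 * N + 1]

def aim_w(u_hat, N, K, nu, iters=20):
    """Fixed-point iteration w <- -(QA)^{-1} Q B(u + w) (toy sketch).
    u_hat holds modes -N..N with zeros for |k| > K.
    A has symbol k^2 - nu*k^4; B(v)_k = -(ik/2) (v*v)_k.
    Assumes the symbol is nonzero on the high modes.
    """
    k = np.arange(-N, N + 1)
    Q = np.abs(k) > K                      # projection onto high modes
    sym = k**2 - nu * k**4                 # symbol of A
    w = np.zeros_like(u_hat)
    for _ in range(iters):
        B = -0.5j * k * conv_mode(u_hat + w, N)
        w_new = np.zeros_like(u_hat)
        w_new[Q] = -B[Q] / sym[Q]          # w = -(QA)^{-1} Q B(u + w)
        w = w_new
    return w
```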

slide-13
SLIDE 13

Derivation of a parametric form (KSE)

Let v = u + w. In operator form, v_t = Av + B(v):

  du/dt = PAu + PB(u) + [PB(u + w) − PB(u)]
  dw/dt = QAw + QB(u + w)

Nonlinear Galerkin: approximate inertial manifold (IM)¹

  dw/dt ≈ 0  ⇒  w ≈ −(QA)⁻¹ QB(u + w)  ⇒  w ≈ ψ(u)

Needs: a spectral gap condition; dim(u) > K: parametrization with time delay (Lu–Lin 17).

A time series (NARMA) model of the form

  u_k^n = R_δ(u_k^{n−1}) + g_k^n + Φ_k^n,

with Φ_k^n := Φ_k^n(u^{n−p:n−1}, f^{n−p:n−1}) of the form

  Φ_k^n = Σ_{j=1}^p [ c_{k,j}^v u_k^{n−j} + c_{k,j}^R R_δ(u_k^{n−j})
          + c_{k,j}^w Σ_{(|k−l|≤K, K<|l|≤2K) or (|l|≤K, K<|k−l|≤2K)} u_l^{n−1} u_{k−l}^{n−j} ]

¹ Foias, Constantin, Temam, Sell, Jolly, Kevrekidis, Titi et al. (88–94)

13 / 24

slide-14
SLIDE 14

Test setting: ν = 3.43, N = 128, dt = 0.001.
Reduced model: K = 5, δ = 100 dt (3 unstable modes, 2 stable modes).

Long-term statistics:
[Figure: probability density function of Re v̂_4 — data vs. truncated system vs. NARMA]
[Figure: auto-correlation function — data vs. truncated system vs. NARMA]
14 / 24

slide-15
SLIDE 15

Prediction

A typical forecast:
[Figure: v̂_4 trajectories vs. time — the truncated system and NARMA against truth]

RMSE of many forecasts:
[Figure: RMSE vs. lead time — NARMA vs. the truncated system]

Forecast time: the truncated system: T ≈ 5; the NARMA system: T ≈ 50 (≈ 2 Lyapunov times).

15 / 24

slide-16
SLIDE 16

Derivation of a parametric form: stochastic Burgers

Let v = u + w. In operator form:

  du/dt = PAu + PB(u) + Pf + [PB(u + w) − PB(u)]
  dw/dt = QAw + QB(u + w)

Spectral gap for Burgers? (likely not)
w(t) is not a function of u(t), but a functional of its path.

16 / 24

slide-17
SLIDE 17

Derivation of a parametric form: stochastic Burgers

Let v = u + w. In operator form:

  du/dt = PAu + PB(u) + Pf + [PB(u + w) − PB(u)]
  dw/dt = QAw + QB(u + w)

Spectral gap for Burgers? (likely not)
w(t) is not a function of u(t), but a functional of its path.

Integration instead (Duhamel):

  w(t) = e^{QAt} w(0) + ∫₀ᵗ e^{QA(t−s)} QB(u(s) + w(s)) ds
  w^n ≈ c₀ QB(u^n) + c₁ QB(u^{n−1}) + · · · + c_p QB(u^{n−p})

Linear-in-parameter approximation:

  PB(u + w) − PB(u) = −P[(2uw + w²)_x]/2 ≈ −P[(uw)_x] + noise
                    ≈ Σ_{j=0}^p c_j P[(u^n QB(u^{n−j}))_x] + noise

(constant factors absorbed into the coefficients c_j)
17 / 24

slide-18
SLIDE 18

A time series (NARMA) model of the form

  u_k^n = R_δ(u_k^{n−1}) + f_k^n + g_k^n + Φ_k^n,

with Φ_k^n := Φ_k^n(u^{n−p:n−1}, f^{n−p:n−1}) of the form

  Φ_k^n = Σ_{j=1}^p [ c_{k,j}^v u_k^{n−j} + c_{k,j}^R R_δ(u_k^{n−j})
          + c_{k,j}^w Σ_{(|k−l|≤K, K<|l|≤2K) or (|l|≤K, K<|k−l|≤2K)} u_l^{n−1} u_{k−l}^{n−j} ]
18 / 24

slide-19
SLIDE 19

Numerical tests: ν = 0.05, K₀ = 4 → random shocks

Full model: N = 128, dt = 0.005.  Reduced model: K = 8, δ = 20 dt.

[Figure: energy spectrum vs. wavenumber (k = 1..8) — true, truncated, NAR]

19 / 24

slide-20
SLIDE 20

[Figure: cross-ACFs of energy, cov(|û_2|², |û_k|²) for k = 1..8, vs. time lag — true, truncated, NAR]

Cross-ACF of energy (4th moments!)

20 / 24

slide-21
SLIDE 21

[Figure: trajectories of modes k = 1..8 vs. time — true, truncated, NAR]

Trajectory prediction in response to force

21 / 24

slide-22
SLIDE 22

Summary and ongoing work

  x′ = f(x) + U(x, y),   y′ = g(x, y).    Data: {x(nh)}_{n=1}^N

  "X′ = f(X) + Z(t, ω)"  —(discretization)→  "X_{n+1} = X_n + R_h(X_n) + Z_n"  —(inference)→  model for prediction

Inference-based stochastic model reduction:
  non-intrusive time series (NARMA) models;
  parametrize projections on path space;
  → effective stochastic reduced model.

22 / 24

slide-23
SLIDE 23

Open problems in model reduction:
  model selection;
  post-processing;
  theoretical understanding of the approximation
    ◮ distance between the two stochastic processes?

23 / 24

slide-24
SLIDE 24

References

Data-driven stochastic model reduction

◮ Chorin, Lu: Discrete approach to stochastic parametrization and dimension reduction in nonlinear dynamics. PNAS 112 (2015), no. 32, 9804–9809.

◮ Lu, Lin, Chorin: Comparison of continuous and discrete-time data-based modeling for hypoelliptic systems. CAMCoS 11 (2016), no. 8, 4227–4246.

◮ Lu, Lin, Chorin: Data-based stochastic model reduction for the Kuramoto–Sivashinsky equation. Physica D 340 (2017), 46–57.

◮ Lin, Lu: Data-driven model reduction, Wiener projections, and the Mori–Zwanzig formalism. Preprint (2019).

Data assimilation

◮ Lu, Tu, Chorin: Accounting for model error from unresolved scales in EnKFs: improving the forecast model. MWR (2017).

Thank you!

FL acknowledges support from JHU, LBL, and NSF.

24 / 24