Data Assimilation with Stochastic Model Reduction of Chaotic Systems



SLIDE 1

Data Assimilation with Stochastic Model Reduction of Chaotic Systems

Fei Lu

Department of Mathematics, Johns Hopkins University

Joint work with: Alexandre J. Chorin (UC Berkeley), Kevin K. Lin (U. of Arizona), Xuemin Tu (U. of Kansas)

Snowbird, SIAM-DS, May 23, 2019




SLIDE 4

Model error from sub-grid scales

ECMWF: 16 km horizontal grid → ~10⁹ degrees of freedom (image: Wikipedia, ECMWF). The Lorenz 96 system (Wilks 2005).

High-dimensional full system, discrete partial data, prediction:

x′ = f(x) + U(x, y),  y′ = g(x, y).

Observe only {x(nh)}_{n=1}^N. Forecast x(t), t ≥ Nh.

High-dimensional multiscale full systems:

◮ can only afford to resolve x′ = f(x)
◮ y: unresolved variables (subgrid scales)

Discrete noisy observations: missing initial conditions.
Ensemble prediction: needs many simulations.

→ How to account for the model error U(x, y)?

SLIDE 5

Given a high-dimensional multiscale full system: x′ = f(x) + U(x, y), y′ = g(x, y).
Ensemble prediction: can afford to resolve x′ = f(x) online.

Accounting for the model error U(x, y) from subgrid scales:

Indirect approaches: correct the forecast ensemble

◮ e.g. inflation and localization¹; relaxation/bias correction²
◮ act in the assimilation step; the deficiency in the forecast model remains

Direct approach: improve the forecast model

◮ parametrization methods³, non-Markovian models⁴
◮ random perturbation, averaging + homogenization⁵

¹Mitchell-Houtekamer (00), Hamill-Whitaker (05), Anderson (07)
²Zhang et al. (04), Dee-Da Silva (98)
³Palmer, Arnold et al. (01, 13), Wilks (05), Meng-Zhang (07), Danforth-Kalnay-Li et al. (08, 09), Berry-Harlim (14), Mitchell-Carrassi (15), Van Leeuwen et al. (18)
⁴Chorin et al. (00-15), Majda-Timofeyev-Harlim et al. (03-13), Chekroun-Kondrashov-Ghil (11, 15), Crommelin-Vanden-Eijnden (08), Gottwald et al. (15)
⁵Hamill-Whitaker (05), Houtekamer et al. (09), Pavliotis-Stuart (08), Gottwald et al. (12-13)

SLIDE 6

Outline

  • 1. Stochastic model reduction

(reduction from simulated data)

◮ Discrete-time stochastic parametrization (NARMA)

  • 2. Data assimilation with the reduced model

(Noisy data + reduced model → state estimation and prediction)



SLIDE 8

Stochastic model reduction

x′ = f(x) + U(x, y),  y′ = g(x, y).  Data {x(nh)}_{n=1}^N.

Why stochastic reduced models? The system is "ergodic":

(1/N) Σ_{n=1}^N F(x(nh)) → ∫ F(x) µ(dx)  as N → ∞,

and U(x, y) acts like a stochastic force.

Memory effects (Mori, Zwanzig, Chorin, Kubo, Majda, Wilks, Ghil, …):

Mori-Zwanzig formalism → generalized Langevin equation:

dx/dt = E[RHS | x]  (Markov term)  +  ∫₀ᵗ K(x(s), t − s) ds  (memory)  +  W_t  (noise).

Fluctuation-dissipation theory → hypoelliptic SDEs:

dX = a(X, Y) dt + Y dt;  dY = b(X, Y) dt + c(X, Y) dW.

Parametrization: multi-layer stochastic models.

Goal: develop a non-Markovian stochastic reduced system for x.
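The ergodicity property above can be illustrated with a toy example: for an ergodic process, the time average of F along one trajectory converges to the average of F against the invariant measure µ. The AR(1) process below is an illustrative stand-in (not the L96 system), chosen because its invariant measure is Gaussian and the limit is known in closed form.

```python
import numpy as np

# Toy illustration of ergodicity: for the AR(1) process
#   X_n = a X_{n-1} + sigma xi_n,   xi_n ~ N(0, 1),
# the invariant measure mu is N(0, sigma^2 / (1 - a^2)), so the time
# average of F(x) = x^2 should converge to the stationary variance.
rng = np.random.default_rng(0)
a, sigma, N = 0.9, 0.5, 200_000

x, acc = 0.0, 0.0
for _ in range(N):
    x = a * x + sigma * rng.standard_normal()
    acc += x**2

time_avg = acc / N                    # (1/N) sum_n F(x_n)
space_avg = sigma**2 / (1 - a**2)     # integral of F d(mu), in closed form
print(time_avg, space_avg)
```

The two printed numbers agree to within sampling error, which is the sense in which the data {x(nh)} alone carry enough statistical information to fit a stochastic reduced model.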

SLIDE 9

Discrete-time stochastic parametrization

NARMA(p, q):

X_n = X_{n-1} + R_h(X_{n-1}) + Z_n,  Z_n = Φ_n + ξ_n,

Φ_n = Σ_{j=1}^p a_j X_{n-j} + Σ_{j=1}^r Σ_{i=1}^s b_{i,j} P_i(X_{n-j})  [auto-regression]
      + Σ_{j=1}^q c_j ξ_{n-j}  [moving average]

R_h(X_{n-1}) comes from a numerical scheme for x′ ≈ f(x); Φ_n depends on the past.

Tasks:
◮ structure derivation: terms and orders (p, r, s, q) in Φ_n;
◮ parameter estimation: a_j, b_{i,j}, c_j, and σ.
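A minimal sketch of simulating a scalar NARMA(p, q) of the form above, assuming the structure and coefficients are already given. The one-step term R_h and all coefficient values below are illustrative placeholders (not the fitted L96 values), and P_i(x) = x^i is taken as the polynomial basis.

```python
import numpy as np

def simulate_narma(n_steps, h=0.05, a=(0.05, -0.02), b=((0.01, -0.001),),
                   c=(0.1,), sigma=0.05, seed=0):
    """Simulate X_n = X_{n-1} + R_h(X_{n-1}) + Z_n, Z_n = Phi_n + xi_n.

    a: AR coefficients a_1..a_p; b[j][i]: coefficient of P_{i+1}(X_{n-1-j})
    with P_i(x) = x^i; c: MA coefficients c_1..c_q; sigma: noise level.
    All values are illustrative placeholders, not fitted to any system.
    """
    rng = np.random.default_rng(seed)
    p, q, r = len(a), len(c), len(b)
    s = len(b[0])
    Rh = lambda x: h * (1.0 - x)          # stand-in one-step term for x' ~ f(x)
    X = np.zeros(n_steps)
    xi = np.zeros(n_steps)
    for n in range(max(p, q, r), n_steps):
        Phi = sum(a[j] * X[n - 1 - j] for j in range(p))          # AR part
        Phi += sum(b[j][i] * X[n - 1 - j] ** (i + 1)              # polynomial part
                   for j in range(r) for i in range(s))
        Phi += sum(c[j] * xi[n - 1 - j] for j in range(q))        # MA part
        xi[n] = sigma * rng.standard_normal()
        X[n] = X[n - 1] + Rh(X[n - 1]) + Phi + xi[n]
    return X

traj = simulate_narma(5000)
```

Note the design point: the noise sequence ξ is stored alongside the state, because the moving-average term feeds past noise realizations back into the dynamics; this is how the model carries memory beyond the AR lags.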

SLIDE 10

Overview:

x′ = f(x) + U(x, y),  y′ = g(x, y).  Data {x(nh)}_{n=1}^N.

Discrete-time stochastic parametrization (NARMA):

X_n = X_{n-1} + R_h(X_{n-1}) + Z_n,  Z_n = Φ_n + ξ_n,

Φ_n = Σ_{j=1}^p a_j X_{n-j} + Σ_{j=1}^q c_j ξ_{n-j} + Σ_{j=1}^r Σ_{i=1}^s b_{i,j} P_i(X_{n-j}).

  • 1. compute R_h(x)
  • 2. derive structure
  • 3. estimate parameters
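For the parameter-estimation step, when the moving-average part is absent (q = 0) the coefficients can be fit by ordinary least squares: regress the one-step residual z_n = x_n − x_{n−1} − R_h(x_{n−1}) on the lagged terms. The sketch below does this on synthetic scalar data with a stand-in R_h; both are assumptions for illustration, not the talk's estimator or data.

```python
import numpy as np

# Hypothetical least-squares fit of a NARMA-type model with q = 0:
#   x_n = x_{n-1} + R_h(x_{n-1}) + a x_{n-1} + b x_{n-1}^2 + xi_n.
# The "truth" below is synthetic; R_h is a stand-in one-step term.
rng = np.random.default_rng(1)
h, a_true, b_true, sigma = 0.05, 0.1, -0.02, 0.2
Rh = lambda x: h * (1.0 - x)

# generate synthetic data from the "true" reduced model
N = 5000
x = np.zeros(N)
x[0] = 3.0                                # start near the stable regime
for n in range(1, N):
    drift = Rh(x[n - 1]) + a_true * x[n - 1] + b_true * x[n - 1] ** 2
    x[n] = x[n - 1] + drift + sigma * rng.standard_normal()

# regress z_n = x_n - x_{n-1} - R_h(x_{n-1}) on [x_{n-1}, x_{n-1}^2]
z = x[1:] - x[:-1] - Rh(x[:-1])
A = np.column_stack([x[:-1], x[:-1] ** 2])
coef, *_ = np.linalg.lstsq(A, z, rcond=None)
print(coef)  # estimates of (a_true, b_true)
```

With a moving-average part (q > 0) the noise terms ξ_{n−j} are not observed, and estimation needs an iterative scheme (e.g. conditional likelihood) rather than a single regression.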

SLIDE 11

Application to the chaotic Lorenz 96 system

A chaotic dynamical system (a simplified atmospheric model):

d/dt x_k = x_{k-1} (x_{k+1} − x_{k-2}) − x_k + 10 − (1/J) Σ_j y_{k,j},
d/dt y_{k,j} = (1/ε) [y_{k,j+1} (y_{k,j-1} − y_{k,j+2}) − y_{k,j} + x_k],

where x ∈ R¹⁸, y ∈ R³⁶⁰. (Wilks 2005)

Find a reduced system for x ∈ R¹⁸ based on

➣ Data {x(nh)}_{n=1}^N
➣ d/dt x_k ≈ x_{k-1} (x_{k+1} − x_{k-2}) − x_k + 10.
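The two-scale system above can be integrated directly, which is how data of the form {x(nh)} would be generated. Below is a sketch with a fourth-order Runge-Kutta step, assuming the standard cyclic-index convention in which the 360 fast variables form one ring (y_{k,J+1} = y_{k+1,1}); ε, the step size, and the initial condition are illustrative choices, since the slide does not specify them.

```python
import numpy as np

# Two-scale Lorenz 96: K = 18 slow variables x, J = 20 fast variables per
# slow one (y in R^360, treated as one cyclic ring). eps is illustrative.
K, J, F, eps = 18, 20, 10.0, 0.5

def rhs(x, y):
    """x: (K,) slow variables; y: (K*J,) fast variables as one ring."""
    # slow: x_k' = x_{k-1}(x_{k+1} - x_{k-2}) - x_k + F - (1/J) sum_j y_{k,j}
    dx = (np.roll(x, 1) * (np.roll(x, -1) - np.roll(x, 2))
          - x + F - y.reshape(K, J).mean(axis=1))
    # fast: y_j' = (1/eps) [ y_{j+1}(y_{j-1} - y_{j+2}) - y_j + x_k ]
    dy = (np.roll(y, -1) * (np.roll(y, 1) - np.roll(y, -2))
          - y + np.repeat(x, J)) / eps
    return dx, dy

def rk4_step(x, y, dt):
    k1x, k1y = rhs(x, y)
    k2x, k2y = rhs(x + 0.5 * dt * k1x, y + 0.5 * dt * k1y)
    k3x, k3y = rhs(x + 0.5 * dt * k2x, y + 0.5 * dt * k2y)
    k4x, k4y = rhs(x + dt * k3x, y + dt * k3y)
    return (x + dt / 6 * (k1x + 2 * k2x + 2 * k3x + k4x),
            y + dt / 6 * (k1y + 2 * k2y + 2 * k3y + k4y))

rng = np.random.default_rng(0)
x = F / 2 + 0.5 * rng.standard_normal(K)
y = 0.1 * rng.standard_normal(K * J)
dt = 0.001
for _ in range(1000):          # integrate to t = 1
    x, y = rk4_step(x, y, dt)
```

Subsampling x from such a run at intervals h gives the training data; the truncated model in the second bullet is the same `rhs` with the y-coupling term dropped.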

SLIDE 13

NARMA: x_n = x_{n-1} + R_h(x_{n-1}) + z_n;  z_n = Φ_n + ξ_n,

Φ_n = a + Σ_{j=1}^p Σ_{l=1}^{d_x} b_{j,l} (x_{n-j})^l + Σ_{j=1}^p c_j R_h(x_{n-j}) + Σ_{j=1}^q d_j ξ_{n-j}.

Selected orders: p = 2, d_x = 3; q = 1 for h = 0.01, q = 0 for h = 0.05.

Polynomial autoregression (POLYAR)⁶:

d/dt x_k = x_{k-1} (x_{k+1} − x_{k-2}) − x_k + 10 + U,  U = P(x_k) + η_k,

with dη_k(t) = φ η_k(t) dt + dB_k(t), where P(x) = Σ_{j=0}^{d_x} a_j x^j. Optimal d_x = 5.

⁶Wilks 05: an MLR model in atmospheric science

SLIDE 14

Long-term statistics

[Figures: empirical probability density function (PDF) of X1, and empirical autocorrelation function (ACF) vs. time, comparing L96, POLYAR, and NARMA, at h = 0.01 and h = 0.05.]


SLIDE 16

Prediction (h = 0.05). A typical ensemble forecast:

[Figures for POLYAR and NARMA: forecast trajectories of X1 in cyan, true trajectory in blue.]

RMSE of many forecasts:

[Figure: RMSE vs. lead time for POLYAR and NARMA.]

Forecast time: POLYAR: T ≈ 1.25; NARMA: T ≈ 2.5. (Full model: T ≈ 2.5, so the "best" forecast time is achieved!)

SLIDE 17

Outline

  • 1. Stochastic model reduction

(reduction from simulated data)

◮ Discrete-time stochastic parametrization (NARMA)

  • 2. Data assimilation with the reduced model

(Noisy data + reduced model → state estimation and prediction)


SLIDE 18

Data assimilation with the reduced model

x′ = f(x) + U(x, y), y′ = g(x, y).

Noisy data: x(nh) + W(n), n = 1, 2, …

Data assimilation: estimate the state of a forward model and make predictions (by ensembles of solutions).

The widely used method: ensemble Kalman filters (EnKF).
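The EnKF referred to above updates a forecast ensemble with a noisy observation. A minimal sketch of one analysis step (the stochastic, perturbed-observations variant, with a linear observation operator H; all dimensions and numbers below are toy choices, not the talk's setup):

```python
import numpy as np

def enkf_analysis(ensemble, z, H, R, rng):
    """One EnKF analysis step (perturbed-observations variant).

    ensemble: (N, d) forecast members; z: (m,) observation;
    H: (m, d) linear observation operator; R: (m, m) obs-noise covariance.
    """
    N = ensemble.shape[0]
    A = ensemble - ensemble.mean(axis=0)      # state anomalies, (N, d)
    HX = ensemble @ H.T                       # members in obs space, (N, m)
    HA = HX - HX.mean(axis=0)
    Pxz = A.T @ HA / (N - 1)                  # cross covariance, (d, m)
    Pzz = HA.T @ HA / (N - 1) + R             # innovation covariance, (m, m)
    K = Pxz @ np.linalg.inv(Pzz)              # Kalman gain, (d, m)
    # each member assimilates a perturbed copy of the observation
    zs = z + rng.multivariate_normal(np.zeros(len(z)), R, size=N)
    return ensemble + (zs - HX) @ K.T

# toy example: d = 4 state dims, observe the first m = 2 components
rng = np.random.default_rng(0)
d, m, N = 4, 2, 500
H = np.eye(m, d)
R = 0.1 * np.eye(m)
prior = 5.0 + 2.0 * rng.standard_normal((N, d))   # forecast ensemble
z = np.zeros(m)                                   # observation
post = enkf_analysis(prior, z, H, R, rng)
```

The analysis pulls the ensemble mean toward the observation and shrinks its spread; in a full filter this step alternates with propagating each member through the forward model (L96x or NARMA in the experiments that follow).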


SLIDE 19

The Lorenz 96 system (Wilks 2005)

Estimate and predict x based on

➣ Noisy data z(n) = x(nh) + W(n)
➣ Forward models:
   L96x, the truncated model d/dt x_k ≈ x_{k-1} (x_{k+1} − x_{k-2}) − x_k + 10
   (accounts for the model error by inflation and localization (IL) in the EnKF);
   NARMA (accounts for the model error by parametrization in the forward model).


SLIDE 20

Relative error of state estimation

[Figure: relative error vs. standard deviation of observation noise, for L96x + IL, NARMA, and full model + IL.]

Relative error for different observation noise levels. (Ensemble size: 1000 for L96x and NARMA; 10 for the full model.)


SLIDE 21

RMSE of state prediction

[Figure: RMSE vs. time for L96x + IL, NARMA, and full model + IL.]

RMSE of 10⁴ ensemble forecasts. (Ensemble size: 1000 for L96x and NARMA; 10 for the full model.)

Summary: The stochastic model improves performance of DA.


SLIDE 22

Summary and ongoing work

Accounting for the model error U(x, y) from subgrid scales:

x′ = f(x) + U(x, y),  y′ = g(x, y).  Data {x(nh)}_{n=1}^N.

Instead of inferring a continuous-time model "X′ = f(X) + Z(t, ω)", discretize and infer a discrete-time model "X_{n+1} = X_n + R_h(X_n) + Z_n" for prediction.

Stochastic model reduction by discrete-time stochastic parametrization:

◮ simplifies the inference from data
◮ incorporates memory flexibly
◮ yields an effective reduced model (NARMA)
◮ captures key statistical-dynamical features
◮ enables medium-range forecasting

Improving the forecast model → improved performance of DA.

SLIDE 23

Ongoing work

Noisy data: state estimation and model inference
◮ data assimilation with non-Markovian models
◮ inference for hidden non-Markovian models

Model reduction for (stochastic) PDEs
◮ stochastic Burgers equation, Navier-Stokes equation

SLIDE 24

References

Data-driven stochastic model reduction

◮ Chorin-Lu: Discrete approach to stochastic parametrization and dimension reduction in nonlinear dynamics. PNAS 112 (2015), no. 32, 9804-9809.
◮ Lu-Lin-Chorin: Comparison of continuous and discrete-time data-based modeling for hypoelliptic systems. CAMCoS 11 (2016), no. 8, 4227-4246.
◮ Lu-Lin-Chorin: Data-based stochastic model reduction for the Kuramoto-Sivashinsky equation. Physica D 340 (2017), 46-57.

Data assimilation

◮ Lu-Tu-Chorin: Accounting for model error from unresolved scales in EnKFs: improving the forecast model. MWR 145 (2017).

Thank you!