Stochastic model reduction: from nonlinear Galerkin to parametric - - PowerPoint PPT Presentation



slide-1
SLIDE 1

Stochastic model reduction: from nonlinear Galerkin to parametric inference

Fei Lu

Department of Mathematics, Johns Hopkins Joint work with: Alexandre J. Chorin (UC Berkeley) Kevin K. Lin (U. of Arizona)

May 22, 2019 SIAM DS19, Snowbird

1 / 24

slide-2
SLIDE 2

Consider dissipative PDEs in operator form:

  v_t = Av + B(v) + f,

where A is linear, self-adjoint (dissipative) and B(v) is nonlinear.

Examples:
  Burgers: v_t = ν v_xx − v v_x + f(x, t)
  Kuramoto–Sivashinsky: v_t = −v_xx − ν v_xxxx − v v_x

2 / 24

slide-3
SLIDE 3

Resolving the equation by Fourier–Galerkin (with periodic BC):

  d/dt v̂_k = −q_k^ν v̂_k − (ik/2) Σ_{|l|≤N, |k−l|≤N} v̂_l v̂_{k−l} + f̂_k(t)

Need: N ∼ 5/ν Fourier modes, dt ∼ 1/N.
E.g. ν = 10⁻⁴: spatial grid = 5 × 10⁴, time steps = 5T × 10⁴.

We are mainly interested in the large scales, K ≪ N.
Question: a reduced model for (v̂_1:K)?
  Reduce the spatial dimension + increase the time step-size.
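As a concrete illustration, the truncated convolution sum above can be evaluated directly. The following is a minimal Python sketch of our own (not the authors' code) of the Galerkin right-hand side for the KSE; the sign convention assumes v(x, t) = Σ v̂_k e^{ikx}:

```python
import numpy as np

def galerkin_rhs(v_hat, N, nu):
    """Right-hand side of the Fourier-Galerkin truncation of the KSE
    v_t = -v_xx - nu*v_xxxx - v*v_x, for modes k = -N..N.
    v_hat[i] holds the coefficient of mode k = i - N.
    """
    k = np.arange(-N, N + 1)
    # linear (dissipative) part: -v_xx -> +k^2, -nu*v_xxxx -> -nu*k^4
    lin = (k**2 - nu * k**4) * v_hat
    # nonlinear part: -(v v_x)_k = -(ik/2) * sum_{|l|<=N, |k-l|<=N} v_l v_{k-l}
    conv = np.convolve(v_hat, v_hat)          # full convolution, modes -2N..2N
    nonlin = -0.5j * k * conv[N:3 * N + 1]    # keep modes -N..N
    return lin + nonlin
```

If v̂ satisfies the reality condition v̂_{−k} = conj(v̂_k), this right-hand side preserves it, so time-stepping keeps v real.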

3 / 24

slide-4
SLIDE 4

Motivation: data assimilation in weather/climate prediction

[Diagram: discrete partial data from a high-dimensional full system → prediction]

  x′ = f(x) + U(x, y),   y′ = g(x, y).

Observe only {x(nh)}_{n=1}^N.  Forecast x(t), t ≥ Nh.

High-dimensional multiscale full systems, chaotic/ergodic:
  ◮ can only afford to resolve x′ = f(x) online
  ◮ y: unresolved variables (subgrid scales)

Discrete noisy observations: missing initial conditions.
Ensemble prediction: need many simulations.

4 / 24

slide-5
SLIDE 5

  x′ = f(x) + U(x, y),   y′ = g(x, y).    Data: {x(nh)}_{n=1}^N

Objective: develop a closed reduced model of x that
  captures key statistical + dynamical properties, and
  can be used for online state estimation and prediction.

[Approximate the stochastic process (x(t), t > 0) in distribution.]

5 / 24

slide-6
SLIDE 6

Various efforts in closure model reduction:

Direct constructions:
  ◮ nonlinear/Petrov–Galerkin: y(t) = F(x(t))
  ◮ Mori–Zwanzig formalism (memory)
      → statistical approximation by a non-Markov process
  ◮ relaxation approximations
  ◮ linear response / filtering / feedback control
  ◮ ...

Inference / data-driven ROM:
  ◮ hypoelliptic SDEs, GLEs and SDDEs
  ◮ discrete-time (time series) models
  ◮ data-driven: POD, DMD, Koopman operator
  ◮ nonparametric inference
  ◮ machine learning (NNs) ...

6 / 24

slide-7
SLIDE 7

Inference-based model reduction SDEs or time series – dynamical models

7 / 24

slide-8
SLIDE 8

Differential system or discrete-time system?

  Differential:   X′ = f(X) + Z(t, ω)
      — informative; inference via likelihood¹; discretization² introduces error
  Discrete-time:  X_{n+1} = X_n + R_h(X_n) + Z_n
      — non-intrusive; error correction by data

¹ Brockwell, Sørensen, Pokern, Wiberg, Samson, ...
² Milstein, Tretyakov, Talay, Mattingly, Stuart, Higham, ...

8 / 24
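The trade-off above can be seen on a toy example of our own (an Ornstein–Uhlenbeck process, not from the slides): fitting the discrete-time map directly from data at spacing h recovers the exact one-step coefficient, whereas a naive Euler discretization of the SDE carries an O(h²) error.

```python
import numpy as np

# For dX = -theta*X dt + sigma dW, the exact one-step map at spacing h is
# X_{n+1} = a*X_n + noise with a = exp(-theta*h); the Euler coefficient is
# 1 - theta*h.  Least squares on data recovers a, correcting that error.
rng = np.random.default_rng(0)
theta, sigma, h, n = 2.0, 0.5, 0.1, 200_000

a_true = np.exp(-theta * h)                       # exact map coefficient
s = sigma * np.sqrt((1 - a_true**2) / (2 * theta))  # exact one-step noise std
X = np.zeros(n)
for i in range(1, n):
    X[i] = a_true * X[i - 1] + s * rng.standard_normal()

# least-squares fit of X_{n+1} = a_hat * X_n (conditional MLE for this model)
a_hat = (X[1:] @ X[:-1]) / (X[:-1] @ X[:-1])
print(a_hat, a_true, 1 - theta * h)   # a_hat ~ a_true = 0.8187..., Euler gives 0.8
```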

slide-9
SLIDE 9

Discrete-time stochastic parametrization

NARMA(p, q) [Chorin–Lu (15)]:

  X_n = X_{n−1} + R_h(X_{n−1}) + Z_n,    Z_n = Φ_n + ξ_n,

  Φ_n = Σ_{j=1}^p a_j X_{n−j} + Σ_{j=1}^r Σ_{i=1}^s b_{i,j} P_i(X_{n−j})   (auto-regression)
      + Σ_{j=1}^q c_j ξ_{n−j}                                              (moving average)

R_h(X_{n−1}) comes from a numerical scheme for x′ ≈ f(x); Φ_n depends on the past.
Cf. NARMAX in system identification: Z_n = Φ(Z, X) + ξ_n.

Tasks:
  Structure derivation: terms and orders (p, r, s, q) in Φ_n;
  Parameter estimation: a_j, b_{i,j}, c_j, and σ — conditional MLE.
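To make the recursion concrete, here is a minimal sketch of our own of simulating a scalar NARMA model; for brevity the nonlinear basis is fixed to the single function P₁(x) = x² (the function and parameter names are ours, not the authors'):

```python
import numpy as np

def simulate_narma(n_steps, Rh, a, b, c, sigma, rng):
    """Simulate a scalar NARMA model (illustrative sketch):
        X_n = X_{n-1} + Rh(X_{n-1}) + Z_n,   Z_n = Phi_n + xi_n,
        Phi_n = sum_j a[j]*X_{n-1-j} + sum_j b[j]*X_{n-1-j}**2  (auto-regression)
              + sum_j c[j]*xi_{n-1-j}                           (moving average)
    with xi_n ~ N(0, sigma^2) i.i.d.; P_1(x) = x**2 stands in for the P_i.
    """
    p, q = len(a), len(c)
    m = max(p, q, 1)                      # slots for the lagged terms
    X = np.zeros(n_steps + m)
    xi = np.zeros(n_steps + m)
    for n in range(m, n_steps + m):
        xi[n] = sigma * rng.standard_normal()
        phi = sum(a[j] * X[n-1-j] + b[j] * X[n-1-j]**2 for j in range(p))
        phi += sum(c[j] * xi[n-1-j] for j in range(q))
        X[n] = X[n-1] + Rh(X[n-1]) + phi + xi[n]
    return X[m:]
```

Given data, the structure (p, r, s, q) would be chosen first and the coefficients then estimated by conditional MLE, as the slide describes.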

9 / 24

slide-10
SLIDE 10

Model reduction for dissipative PDEs by parametric inference

10 / 24

slide-11
SLIDE 11

Kuramoto–Sivashinsky: v_t = −v_xx − ν v_xxxx − v v_x
Burgers: v_t = ν v_xx − v v_x + f(x, t)

Goal: a closed model for (v̂_1:K), K = 2K₀ ≪ N.

  d/dt v̂_k = −q_k^ν v̂_k − (ik/2) Σ_{|l|≤K, |k−l|≤K} v̂_l v̂_{k−l} + f̂_k(t)
             − (ik/2) Σ_{|l|>K or |k−l|>K} v̂_l v̂_{k−l}

View (v̂_1:K) ∼ x and (v̂_k, k > K) ∼ y:

  x′ = f(x) + U(x, y),   y′ = g(x, y).

Task: represent the effect of the high modes on the low modes.

11 / 24

slide-12
SLIDE 12

Derivation of a parametric form (KSE)

Let v = u + w. In operator form, v_t = Av + B(v):

  du/dt = PAu + PB(u) + [PB(u + w) − PB(u)]
  dw/dt = QAw + QB(u + w)

Nonlinear Galerkin: approximate inertial manifold (IM)¹

  dw/dt ≈ 0  ⇒  w ≈ −(QA)⁻¹ QB(u + w)  ⇒  w ≈ ψ(u)

Needs: a spectral gap condition; dim(u) > K: parametrization with time delay (Lu–Lin 17).

¹ Foias, Constantin, Temam, Sell, Jolly, Kevrekidis, Titi et al. (88–94)

12 / 24
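The map w ≈ ψ(u) is typically computed by iterating the fixed-point relation w ← −(QA)⁻¹QB(u + w). A toy sketch of our own in Fourier space, using the KSE symbol for A and B(v) = −v v_x (convergence relies on the strong damping of the high modes, i.e. the spectral gap the slide mentions; all names are ours):

```python
import numpy as np

def conv_mode(v_hat, N):
    """(v*v)_k for k = -N..N, with v_hat indexed by k+N."""
    return np.convolve(v_hat, v_hat)[N:3 * N + 1]

def aim_w(u_hat, N, K, nu, iters=20):
    """Fixed-point iteration w <- -(QA)^{-1} Q B(u + w) (toy sketch).
    u_hat holds modes -N..N with zeros for |k| > K.
    A has symbol k^2 - nu*k^4; B(v)_k = -(ik/2) (v*v)_k.
    Assumes the symbol is nonzero on the high modes.
    """
    k = np.arange(-N, N + 1)
    Q = np.abs(k) > K                      # projection onto high modes
    sym = k**2 - nu * k**4                 # symbol of A
    w = np.zeros_like(u_hat)
    for _ in range(iters):
        B = -0.5j * k * conv_mode(u_hat + w, N)
        w_new = np.zeros_like(u_hat)
        w_new[Q] = -B[Q] / sym[Q]          # w = -(QA)^{-1} Q B(u + w)
        w = w_new
    return w
```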

slide-13
SLIDE 13

Derivation of a parametric form (KSE)

Let v = u + w. In operator form, v_t = Av + B(v):

  du/dt = PAu + PB(u) + [PB(u + w) − PB(u)]
  dw/dt = QAw + QB(u + w)

Nonlinear Galerkin: approximate inertial manifold (IM)¹

  dw/dt ≈ 0  ⇒  w ≈ −(QA)⁻¹ QB(u + w)  ⇒  w ≈ ψ(u)

Needs: a spectral gap condition; dim(u) > K: parametrization with time delay (Lu–Lin 17).

A time series (NARMA) model of the form

  u_k^n = R_δ(u_k^{n−1}) + g_k^n + Φ_k^n,

with Φ_k^n := Φ_k^n(u^{n−p:n−1}, f^{n−p:n−1}) of the form

  Φ_k^n = Σ_{j=1}^p [ c_{k,j}^v u_k^{n−j} + c_{k,j}^R R_δ(u_k^{n−j})
          + c_{k,j}^w Σ_{(|k−l|≤K, K<|l|≤2K) or (|l|≤K, K<|k−l|≤2K)} u_l^{n−1} u_{k−l}^{n−j} ]

¹ Foias, Constantin, Temam, Sell, Jolly, Kevrekidis, Titi et al. (88–94)

13 / 24

slide-14
SLIDE 14

Test setting: ν = 3.43, N = 128, dt = 0.001.
Reduced model: K = 5, δ = 100 dt (3 unstable modes, 2 stable modes).

Long-term statistics:
[Figure: probability density function of Re v̂_4 — data vs. truncated system vs. NARMA]
[Figure: auto-correlation function — data vs. truncated system vs. NARMA]
14 / 24

slide-15
SLIDE 15

Prediction

A typical forecast:
[Figure: v̂_4 trajectories vs. time — the truncated system and NARMA against truth]

RMSE of many forecasts:
[Figure: RMSE vs. lead time — NARMA vs. the truncated system]

Forecast time: the truncated system: T ≈ 5; the NARMA system: T ≈ 50 (≈ 2 Lyapunov times).

15 / 24

slide-16
SLIDE 16

Derivation of a parametric form: stochastic Burgers

Let v = u + w. In operator form:

  du/dt = PAu + PB(u) + Pf + [PB(u + w) − PB(u)]
  dw/dt = QAw + QB(u + w)

Spectral gap for Burgers? (likely not)
w(t) is not a function of u(t), but a functional of its path.

16 / 24

slide-17
SLIDE 17

Derivation of a parametric form: stochastic Burgers

Let v = u + w. In operator form:

  du/dt = PAu + PB(u) + Pf + [PB(u + w) − PB(u)]
  dw/dt = QAw + QB(u + w)

Spectral gap for Burgers? (likely not)
w(t) is not a function of u(t), but a functional of its path.

Integration instead (Duhamel):

  w(t) = e^{QAt} w(0) + ∫₀ᵗ e^{QA(t−s)} QB(u(s) + w(s)) ds
  w^n ≈ c₀ QB(u^n) + c₁ QB(u^{n−1}) + · · · + c_p QB(u^{n−p})

Linear-in-parameter approximation:

  PB(u + w) − PB(u) = −P[(2uw + w²)_x]/2 ≈ −P[(uw)_x] + noise
                    ≈ Σ_{j=0}^p c_j P[(u^n QB(u^{n−j}))_x] + noise

(constant factors absorbed into the coefficients c_j)
17 / 24

slide-18
SLIDE 18

A time series (NARMA) model of the form

  u_k^n = R_δ(u_k^{n−1}) + f_k^n + g_k^n + Φ_k^n,

with Φ_k^n := Φ_k^n(u^{n−p:n−1}, f^{n−p:n−1}) of the form

  Φ_k^n = Σ_{j=1}^p [ c_{k,j}^v u_k^{n−j} + c_{k,j}^R R_δ(u_k^{n−j})
          + c_{k,j}^w Σ_{(|k−l|≤K, K<|l|≤2K) or (|l|≤K, K<|k−l|≤2K)} u_l^{n−1} u_{k−l}^{n−j} ]
18 / 24

slide-19
SLIDE 19

Numerical tests: ν = 0.05, K₀ = 4 → random shocks

Full model: N = 128, dt = 0.005.  Reduced model: K = 8, δ = 20 dt.

[Figure: energy spectrum vs. wavenumber (k = 1..8) — true, truncated, NAR]

19 / 24

slide-20
SLIDE 20

[Figure: cross-ACFs of energy, cov(|û_2|², |û_k|²) for k = 1..8, vs. time lag — true, truncated, NAR]

Cross-ACF of energy (4th moments!)

20 / 24

slide-21
SLIDE 21

[Figure: trajectories of modes k = 1..8 vs. time — true, truncated, NAR]

Trajectory prediction in response to force

21 / 24

slide-22
SLIDE 22

Summary and ongoing work

  x′ = f(x) + U(x, y),   y′ = g(x, y).    Data: {x(nh)}_{n=1}^N

  "X′ = f(X) + Z(t, ω)"  —(discretization)→  "X_{n+1} = X_n + R_h(X_n) + Z_n"  —(inference)→  model for prediction

Inference-based stochastic model reduction:
  non-intrusive time series (NARMA) models;
  parametrize projections on path space;
  → effective stochastic reduced model.

22 / 24

slide-23
SLIDE 23

Open problems in model reduction:
  model selection;
  post-processing;
  theoretical understanding of the approximation
    ◮ distance between the two stochastic processes?

23 / 24

slide-24
SLIDE 24

References

Data-driven stochastic model reduction

◮ Chorin, Lu: Discrete approach to stochastic parametrization and dimension reduction in nonlinear dynamics. PNAS 112 (2015), no. 32, 9804–9809.

◮ Lu, Lin, Chorin: Comparison of continuous and discrete-time data-based modeling for hypoelliptic systems. CAMCoS 11 (2016), no. 8, 4227–4246.

◮ Lu, Lin, Chorin: Data-based stochastic model reduction for the Kuramoto–Sivashinsky equation. Physica D 340 (2017), 46–57.

◮ Lin, Lu: Data-driven model reduction, Wiener projections, and the Mori–Zwanzig formalism. Preprint (2019).

Data assimilation

◮ Lu, Tu, Chorin: Accounting for model error from unresolved scales in EnKFs: improving the forecast model. MWR (2017).

Thank you!

FL acknowledges support from JHU, LBL, and NSF.

24 / 24