SLIDE 1

Combining Data Assimilation and Machine Learning to emulate a numerical model

Julien Brajard, Alberto Carrassi, Marc Bocquet, Laurent Bertino 05 June 2019

NERSC, LOCEAN-IPSL-Sorbonne Université, CEREA

SLIDE 2

Motivation

Chlorophyll-a (Model), July 26, 2018

TOPAZ4-ECOSMO forecast

  • Unresolved processes
  • Unknown parameters

Chlorophyll-a (Observation), July 26, 2018

MODIS Aqua

  • Sparse
  • Noisy


SLIDE 5

Typology of problems and approaches

[Diagram: approaches classified by what is estimated and by what is assumed known.
Estimated quantity, from left to right: state; state + model parameters; state + coefficients of the ODE; state + emulator; emulator.
Model assumption: perfect model; imperfect model (general ODE form); no model (no ODE).
Observation assumption: perfect observations; imperfect observations (sparse, noisy).
The spectrum runs from Data Assimilation, through DA+ML, to Machine Learning. This talk addresses the combined DA+ML regime.]

SLIDE 14

Our Objective:

Producing an accurate and reliable emulator of a numerical model given sparse and noisy observations.

SLIDE 15

Specification of the problem

Data: a multidimensional time series y^obs_k (1 ≤ k ≤ K) observed from an underlying dynamical process:

y^obs_k = H_k(x_k) + ϵ^obs_k

  • H_k is the known observation operator: R^Nx → R^p,
  • ϵ^obs_k is an observation noise.

Underlying dynamical model: dx/dt = φ(x)

Resolvent: x_{k+1} = x_k + ∫_{t_k}^{t_{k+1}} φ(x) dt

SLIDE 18

Two complementary goals

  • 1. Inferring the ODE using only a DA algorithm [Bocquet et al., 2019]:

    dx/dt = φ_A(x), φ_A(x) = A r(x),

    where r(x) ∈ R^Np is specified and A ∈ R^{Nx×Np} is to be determined.

  • 2. Emulating the resolvent by combining DA and ML [Brajard et al., 2019]:

    x_{k+1} = G_W(x_k) + ϵ^m_k,

    where G_W is a neural network parametrized by W and ϵ^m_k is a stochastic noise.

SLIDE 20

First goal: Inferring the ODE using DA

SLIDE 21

First goal: ODE representation for the surrogate model

Ordinary differential equation (ODE) representation of the surrogate dynamics:

dx/dt = φ_A(x), φ_A(x) = A r(x),

where

  • A ∈ R^{Nx×Np} is a matrix of coefficients to be determined,
  • r(x) is a vector of nonlinear regressors of size Np. For instance, for one-dimensional spatial systems and up to bilinear order:

r(x) = [ 1, {x_n}_{0≤n<Nx}, {x_n x_m}_{0≤n≤m<Nx} ],   Np = (Nx + 1)(Nx + 2)/2.

→ Intractable in high dimension! Typically, Nx = O(10^6–10^9).
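To make the regressor construction concrete, here is a minimal NumPy sketch (an illustration, not the authors' code) that builds r(x) up to bilinear order, checks the Np count, and shows how the surrogate tendency φ_A(x) = A r(x) would be evaluated:

```python
import numpy as np

def regressors_bilinear(x):
    """Regressor vector r(x) up to bilinear order: [1, x_n, x_n*x_m (n <= m)]."""
    Nx = x.size
    quad = [x[n] * x[m] for n in range(Nx) for m in range(n, Nx)]
    return np.concatenate(([1.0], x, quad))

Nx = 40
x = np.random.randn(Nx)
r = regressors_bilinear(x)
assert r.size == (Nx + 1) * (Nx + 2) // 2   # Np = 861 already for Nx = 40

# Surrogate tendency for a given coefficient matrix A (shape Nx x Np):
A = np.zeros((Nx, r.size))
dxdt = A @ r
```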

SLIDE 23

Reducing the number of regressors

Locality: the physics is local, so all multivariate monomials in the ODEs have variables x_n that belong to a stencil, i.e. a local arrangement of grid points around a given node. In 1D and with a stencil of size 2L + 1, the size of the dense A is Nx × Na, where

Na = Σ_{l=L+1}^{2L+2} l = 3(L + 1)(L + 2)/2.

Homogeneity: moreover, we can additionally assume translational invariance. In that case A becomes a vector of size Na.
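A quick check of the stencil count (a sketch; `n_local_regressors` is a hypothetical helper name). Note that L = 2 gives Na = 18, consistent with the Na = 18 quoted later for the ODE parametrization:

```python
def n_local_regressors(L):
    """Na = sum of l for l = L+1 .. 2L+2, i.e. 3(L+1)(L+2)/2 regressors
    per grid point for a 1D stencil of half-width L (stencil size 2L + 1)."""
    return sum(range(L + 1, 2 * L + 3))

assert all(n_local_regressors(L) == 3 * (L + 1) * (L + 2) // 2
           for L in range(6))
print(n_local_regressors(2))  # -> 18 for a 5-point stencil
```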

SLIDE 25

Bayesian analysis of the problem

Bayesian view on state and model estimation:

p(A, x_{0:K} | y_{0:K}) = p(y_{0:K} | x_{0:K}, A) p(x_{0:K} | A) p(A) / p(y_{0:K}).

Data assimilation cost function, assuming Gaussian error statistics and Markovian dynamics:

J(A, x_{0:K}) = 1/2 Σ_{k=0}^{K} ∥y_k − H_k(x_k)∥²_{R_k⁻¹} + 1/2 Σ_{k=1}^{K} ∥x_k − F_A(x_{k−1})∥²_{Q_k⁻¹} − ln p(x_0, A),

where F_A is the resolvent of the model between t_k and t_k + ∆t.

→ Allows handling partial and noisy observations.

Typical machine learning cost function, with H_k = I_k in the limit R_k → 0:

J(A) ≈ 1/2 Σ_{k=1}^{K} ∥y_k − F_A(y_{k−1})∥²_{Q_k⁻¹} − ln p(y_0, A).
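Under the Gaussian assumptions above, the two sums of J(A, x_{0:K}) translate directly into code. The sketch below is an illustration only: the prior term −ln p(x_0, A) is omitted, and R_k⁻¹, Q_k⁻¹ are passed as precision matrices:

```python
def da_cost(A, x, y, H, Rinv, Qinv, resolvent):
    """Weak-constraint DA cost J(A, x_{0:K}) under Gaussian statistics.
    x: (K+1, Nx) state trajectory; y[k]: observation vectors;
    H[k]: observation operators (callables); resolvent(A, x): F_A.
    The prior term -ln p(x0, A) is omitted in this sketch."""
    J = 0.0
    for k in range(len(y)):                 # observation misfits
        d = y[k] - H[k](x[k])
        J += 0.5 * d @ Rinv[k] @ d
    for k in range(1, x.shape[0]):          # model misfits
        m = x[k] - resolvent(A, x[k - 1])
        J += 0.5 * m @ Qinv[k] @ m
    return J
```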

SLIDE 28

Experiment setup

[Timeline: generating physical states (step δtr, from t_0 to t_K, yielding observations y_0, …, y_K), learning step (step δta, window T), forecast step (step δtf, window T + T_f).]

Illustration using a Lorenz 96 model:

  • Size of the state: Nx = 40
  • Integration scheme: 4th-order Runge-Kutta (RK4)
  • Integration time step: δtr = ∆t = 0.05
  • Integration length: K = 50
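For reference, a minimal implementation of this setup (Lorenz 96 tendency plus one RK4 step; the forcing F = 8 is the standard choice for this model and is an assumption here, as the slide does not state it):

```python
import numpy as np

def lorenz96(x, F=8.0):
    """Lorenz 96 tendency dx/dt with cyclic boundary conditions."""
    return (np.roll(x, -1) - np.roll(x, 2)) * np.roll(x, 1) - x + F

def rk4_step(f, x, dt):
    """One 4th-order Runge-Kutta step, the integration scheme of the slide."""
    k1 = f(x)
    k2 = f(x + 0.5 * dt * k1)
    k3 = f(x + 0.5 * dt * k2)
    k4 = f(x + dt * k3)
    return x + dt / 6.0 * (k1 + 2 * k2 + 2 * k3 + k4)

Nx, dt = 40, 0.05                 # Nx = 40, δtr = ∆t = 0.05
x = 3.0 + np.random.randn(Nx)
for _ in range(50):               # K = 50 integration steps
    x = rk4_step(lorenz96, x, dt)
```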

SLIDE 30

Case studies

Case                time-stepping scheme    time step            observation noise
Identifiable        RK4                     δta = ∆t = 0.05      none (perfect obs.)
Non-identifiable    RK2                     δta = 0.05/Nc        none (perfect obs.)
Identifiable        RK4                     δta = ∆t = 0.05      σy > 0

Identifiable model:

  • The true model φ(x) is included in the candidates φ_A(x),
  • The integration scheme and the time step used for generating the observations are the same as those used for the surrogate model.

SLIDE 31

Case 1: Identifiable model and perfect observations

Comparison of the ODE coefficients: ∥Aa − Ar∥∞ ∼ 10⁻¹³, where Ar are the coefficients of the reference equation (truth) and Aa are the coefficients of the surrogate ODE. Almost perfect reconstruction, down to machine precision.

SLIDE 32

Case 2: Non-identifiable model and perfect observations

Surrogate model based on an RK2 scheme, δta = ∆t/Nc. Analysis of the modelling depth as a function of Nc.

[Figure: average RMSE vs. forecast lead time (Lyapunov time units) for Nc = 1, …, 5.]

SLIDE 33

Case 3: Identifiable model and imperfect observations

ODE coefficients

[Figure: gap between the surrogate and reference dynamics, ∥Aa − Ar∥₂ and ∥Aa − Ar∥∞, as a function of the observation noise σy = 2⁻⁹ … 2⁻¹.]

SLIDE 34

Case 3: Identifiable model and imperfect observations

Forecast skill

[Figure: average RMSE vs. forecast lead time for observation noise levels σy = 2⁻⁹, …, 2⁻¹.]

SLIDE 35

Remarks: connections between data assimilation and machine learning

Data Assimilation                 Machine Learning
Dynamical system                  Residual deep neural network
Parametrized forecasting model    Layer of a neural network
Optimization                      Training
Adjoint modelling                 Backpropagation
Locality assumption               Convolutional layers

SLIDE 41

Second goal: Emulating a model by combining DA and ML

SLIDE 42

General remarks

What is Data Assimilation good at? Given a numerical model, some observations, and assumptions on uncertainties:

  • Estimate the state of a system in an objective way,
  • Estimate the uncertainty of the state.

What is Machine Learning good at? Given a "good enough" dataset:

  • Retrieve hidden relationships in the dataset.

Idea: combine both approaches to develop an accurate emulator of a numerical model.

SLIDE 45

Proposed algorithm

  • Observations: y^obs_k = H_k(x_k) + ϵ^obs_k
  • The neural net: x_{k+1} = G_W(x_k) + ϵ^m_k = x_k + ∫_{t_k}^{t_{k+1}} φ(x) dt

Cycle (see the sketch after this list):

  • Initialization of W
  • DA step: fix W, estimate x^a_{1:K} using y^obs (finite-size ensemble Kalman filter)
  • ML step: fix x^a_{1:K}, estimate W (training of a neural net)
  • Cycle; stop if converged.
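The cycle amounts to plain coordinate descent. The sketch below is illustrative only: `da_step` and `ml_step` are placeholders for the finite-size EnKF analysis and the network training, which the slides do not spell out, and a fixed cycle count stands in for the convergence test:

```python
def da_ml_emulator(y_obs, W0, da_step, ml_step, n_cycles=10):
    """Alternate DA and ML steps to estimate the emulator weights W.
    da_step(W, y_obs): returns the analysis states xa_{1:K} with the
    surrogate G_W frozen. ml_step(xa, W): retrains the network on the
    analysis trajectory with xa frozen."""
    W = W0
    for _ in range(n_cycles):
        xa = da_step(W, y_obs)   # DA step: fix W, estimate xa_{1:K}
        W = ml_step(xa, W)       # ML step: fix xa_{1:K}, estimate W
    return W
```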

SLIDE 52

Numerical experiment: Lorenz 96 model

A simulation x^ref_{0:K} is performed over K = 40,000 time steps and observed through

y^obs_k = H_k(x^ref_k) + ϵ^obs_k,   y^obs_k ∈ R^p.

  • H_k is defined at each time step by randomly sampling p = 20 observed components (50% of the state space).
  • ϵ^obs_k is generated using a Gaussian law of mean 0 and standard deviation 1.
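A sketch of the observation generation under these settings (`observe` is a hypothetical helper; the reference state would come from the Lorenz 96 run above):

```python
import numpy as np

rng = np.random.default_rng(0)
Nx, p = 40, 20

def observe(x_ref):
    """Apply a random H_k (p of the Nx components, 50% coverage) and
    add Gaussian noise of mean 0 and standard deviation 1."""
    idx = rng.choice(Nx, size=p, replace=False)
    return idx, x_ref[idx] + rng.normal(0.0, 1.0, size=p)

x_ref = rng.standard_normal(Nx)   # stand-in for one state of x^ref_{0:K}
idx, y_obs = observe(x_ref)
```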

SLIDE 53

Neural Network setup

Architecture: x_k → BatchNorm → [CNN2a, CNN2b × CNN2c] → CNN3 → CNN4 → (+ x_k) → G_W(x_k)

Layer             number of units   filter size   number of weights
1 (batchnorm)     —                 —             2
2 (bilinear)      24 × 3            5             144 × 3
3 (convolutive)   37                5             8917
4 (linear)        1                 1             38

Residual bilinear convolutional neural network (9391 weights), compared with Na = 18 in the case of the ODE parametrization.
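A possible PyTorch rendering of this architecture, as a sketch inferred from the layer table: the circular padding and the exact form of the multiplicative (bilinear) coupling are assumptions, and the resulting parameter count lands within a few weights of the 9391 quoted:

```python
import torch
import torch.nn as nn

class BilinearResNet(nn.Module):
    """Sketch of the residual bilinear CNN: BatchNorm, three parallel
    convolutions (one linear branch, two multiplied together), a wide
    convolution, a 1x1 linear convolution, and a residual connection."""
    def __init__(self):
        super().__init__()
        self.bn = nn.BatchNorm1d(1)
        conv = lambda ci, co, k: nn.Conv1d(ci, co, k, padding=k // 2,
                                           padding_mode='circular')
        self.cnn2a = conv(1, 24, 5)   # 144 weights per branch
        self.cnn2b = conv(1, 24, 5)
        self.cnn2c = conv(1, 24, 5)
        self.cnn3 = conv(48, 37, 5)   # 8917 weights
        self.cnn4 = conv(37, 1, 1)    # 38 weights

    def forward(self, x):             # x: (batch, 1, Nx)
        h = self.bn(x)
        h = torch.cat([self.cnn2a(h), self.cnn2b(h) * self.cnn2c(h)], dim=1)
        h = self.cnn3(h)
        return x + self.cnn4(h)       # residual: G_W(x) = x + NN(x)

net = BilinearResNet()
print(sum(p.numel() for p in net.parameters()))  # -> 9389 (slide quotes 9391)
```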

SLIDE 55

Evaluation

  • Interpolating the observations.
    Score: RMSE-a (root-mean-square error of the analysis)
  • Forecasting skill.
    Score: RMSE-f (root-mean-square error of the forecast as a function of lead time)
  • Reproducing the long-term dynamics.
    Score: Lyapunov exponents and PSD (power spectral density), compared with the true model.
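For instance, RMSE-f can be scored by running the surrogate from a true initial state and measuring the error at each lead time (a sketch; `surrogate_step` is a hypothetical argument standing for one step of G_W):

```python
import numpy as np

def rmse_f(surrogate_step, true_traj, n_lead):
    """RMSE-f: root-mean-square forecast error as a function of lead
    time, starting the surrogate from the true initial state."""
    x = np.array(true_traj[0], dtype=float)
    scores = []
    for k in range(1, n_lead + 1):
        x = surrogate_step(x)                        # one surrogate step
        scores.append(np.sqrt(np.mean((x - true_traj[k]) ** 2)))
    return np.array(scores)
```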

SLIDE 61

Convergence of the algorithm

SLIDE 63

Interpolation

RMSE (obs) = 1, RMSE-a = 0.8

Method                                   RMSE-a
Lower bound: quadratic interpolation     2.32
DA with surrogate model                  0.80
Upper bound: DA with true model          0.34
slide-66
SLIDE 66

Forecast skill

  • Lower bound: Neural Net trained with observation interpolated

using quadratic interpolation (no data assimilation).

  • Upper bound: Neural Net trained with “perfect” observations

(complete, no noise).

23

slide-67
SLIDE 67

Forecast skill

  • Lower bound: Neural Net trained with observation interpolated

using quadratic interpolation (no data assimilation).

  • Upper bound: Neural Net trained with “perfect” observations

(complete, no noise).

23

SLIDE 68

Sensitivity to noise and density of observations

Sensitivity to the density of observations: RMSE-f(t0 + δt), observation noise σobs = 1.

Sensitivity to the noise of observations: RMSE-f(t0 + δt), density of observations: 50%.

SLIDE 69

Reconstruction of the long-term dynamics

Power spectral density, Lyapunov exponents

  • Lower bound: neural net trained on observations interpolated using quadratic interpolation (no data assimilation).
  • Upper bound: true model.

SLIDE 71

Conclusion

Emulate a numerical model given sparse and noisy observations

  • Bayesian data assimilation for state and model estimation:
    • equivalent to a machine learning approach,
    • makes use of locality and homogeneity to reduce the dimension of the model parameters.
  • Combined data assimilation / machine learning:
    • emulates the resolvent of the model,
    • the neural nets are trained on states estimated by data assimilation.

Properties of the neural net surrogate model

  • Interpolation: denoising and interpolation of the observations
  • Predictability skill: sensitive to model noise, and to observation density below 50%
  • Replication of the long-term dynamics properties

SLIDE 74

Marc Bocquet, Julien Brajard, Alberto Carrassi, and Laurent Bertino. Data assimilation as a deep learning tool to infer ODE representations of dynamical models. Nonlinear Processes in Geophysics Discussions, pages 1–29, 2019. URL: https://doi.org/10.5194/npg-2019-7, doi:10.5194/npg-2019-7.

J. Brajard, A. Carrassi, M. Bocquet, and L. Bertino. Combining data assimilation and machine learning to emulate a dynamical model from sparse and noisy observations: a case study with the Lorenz 96 model. Geoscientific Model Development Discussions, 2019:1–21, 2019. URL: https://www.geosci-model-dev-discuss.net/gmd-2019-136/, doi:10.5194/gmd-2019-136.

Contact: julien.brajard@sorbonne-universite.fr, julien.brajard@nersc.no

Call for papers: 2–4 October 2019, https://sites.google.com/view/climateinformatics2019