 
Combining Data Assimilation and Machine Learning to emulate a numerical model
Julien Brajard, Alberto Carrassi, Marc Bocquet, Laurent Bertino
05 June 2019
NERSC, LOCEAN-IPSL-Sorbonne Université, CEREA

Motivation
Chlorophyll-a (Model), July 26, 2018, TOPAZ4-ECOSMO forecast
• Unresolved process
• Unknown parameters
Chlorophyll-a (Observation), July 26, 2018, MODIS Aqua
• Sparse
• Noisy

Typology of problems and approaches
[Diagram: quantity to estimate as a function of the quality of the observations and the knowledge of the model]
• Observations: perfect, or imperfect (sparse, noisy)
• Model knowledge: perfect model, imperfect model, no model (general ODE form), no model (no ODE)
• Quantities to estimate: State; State + Model parameters; State + coefficients of the ODE; State + Emulator; Emulator
• Approaches: Data Assimilation, DA+ML, Machine Learning
• The region addressed here is marked "This talk" on the diagram.

Our Objective: Producing an accurate and reliable emulator of a numerical model given sparse and noisy observations

Specification of the problem
Data: multidimensional time series $y^{\mathrm{obs}}_k$ ($1 \le k \le K$) observed from an underlying dynamical process:
\[ y^{\mathrm{obs}}_k = H_k(x_k) + \epsilon^{\mathrm{obs}}_k \]
• $H_k$ is the known observation operator: $\mathbb{R}^{N_x} \to \mathbb{R}^{p}$
• $\epsilon^{\mathrm{obs}}_k$ is a noise
Underlying dynamical model:
\[ \frac{dx}{dt} = \phi(x) \]
Resolvent:
\[ x_{k+1} = x_k + \int_{t_k}^{t_{k+1}} \phi(x)\,dt \]

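As an illustration of the resolvent and observation equations of this slide, here is a minimal sketch in plain Python. Everything specific in it is an assumption made for the example and does not come from the talk: the tendency `phi` is a simple linear decay, the integral is approximated with fourth-order Runge-Kutta substeps, and the observation operator keeps every second component.

```python
import random

def phi(x):
    # Hypothetical tendency: linear decay dx/dt = -x (illustration only).
    return [-xi for xi in x]

def resolvent(x, dt, n_sub=10):
    """Approximate x_{k+1} = x_k + integral of phi(x) dt with RK4 substeps."""
    h = dt / n_sub
    for _ in range(n_sub):
        k1 = phi(x)
        k2 = phi([xi + 0.5 * h * ki for xi, ki in zip(x, k1)])
        k3 = phi([xi + 0.5 * h * ki for xi, ki in zip(x, k2)])
        k4 = phi([xi + h * ki for xi, ki in zip(x, k3)])
        x = [xi + h / 6.0 * (a + 2 * b + 2 * c + d)
             for xi, a, b, c, d in zip(x, k1, k2, k3, k4)]
    return x

def observe(x, sigma, rng):
    """y_k = H_k(x_k) + eps_k, with an assumed H_k keeping every second component."""
    return [xi + rng.gauss(0.0, sigma) for xi in x[::2]]

rng = random.Random(0)          # fixed seed so the sketch is reproducible
x = [1.0, 2.0, 3.0, 4.0]        # arbitrary initial state, N_x = 4
trajectory, observations = [x], []
for k in range(5):
    x = resolvent(x, dt=0.1)
    trajectory.append(x)
    observations.append(observe(x, sigma=0.05, rng=rng))
```

Here `trajectory` plays the role of the hidden states $x_k$ and `observations` the role of the sparse, noisy data $y^{\mathrm{obs}}_k$ that an emulator would have to be trained from.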
Two complementary goals
1. Inferring the ODE using only DA algorithm [Bocquet et al., 2019]:
\[ \frac{dx}{dt} = \phi_A(x), \qquad \phi_A(x) = A\,r(x), \]
where $r(x) \in \mathbb{R}^{N_p}$ is specified and $A \in \mathbb{R}^{N_x \times N_p}$ is to be determined.
2. Emulation of the resolvent combining DA and ML [Brajard et al., 2019]:
\[ x_{k+1} = G_W(x_k) + \epsilon^{m}_k, \]
where $G_W$ is a neural network parametrized by $W$ and $\epsilon^{m}_k$ is a stochastic noise.

First goal: Inferring the ODE using DA
First goal: ODE representation for the surrogate model
Ordinary differential equations (ODEs) representation of the surrogate dynamics:
\[ \frac{dx}{dt} = \phi_A(x), \qquad \phi_A(x) = A\,r(x), \]
where
• $A \in \mathbb{R}^{N_x \times N_p}$ is a matrix of coefficients to be determined.
• $r(x)$ is a vector of nonlinear regressors of size $N_p$.
For instance, for one-dimensional spatial systems and up to bilinear order:
\[ r(x) = \left[ 1, \{x_n\}_{0 \le n < N_x}, \{x_n x_m\}_{0 \le n \le m < N_x} \right], \qquad N_p = \frac{1}{2}(N_x + 1)(N_x + 2). \]
→ Intractable in high dimension! Typically, $N_x = O(10^{6})$ to $O(10^{9})$.

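The regressor vector up to bilinear order and its size can be checked with a short sketch. The function names `regressors` and `n_regressors` are hypothetical helpers introduced for this example, not part of the talk.

```python
from itertools import combinations_with_replacement

def regressors(x):
    """Build r(x) = [1, {x_n}, {x_n x_m} for n <= m] as a flat list."""
    r = [1.0]                      # constant term
    r.extend(x)                    # linear terms
    r.extend(x[n] * x[m]           # bilinear terms, each unordered pair once
             for n, m in combinations_with_replacement(range(len(x)), 2))
    return r

def n_regressors(n_x):
    """Closed form N_p = (N_x + 1)(N_x + 2) / 2."""
    return (n_x + 1) * (n_x + 2) // 2

x = [0.5, -1.0, 2.0]               # arbitrary state with N_x = 3
r = regressors(x)
assert len(r) == n_regressors(len(x))

# Growth of N_p with the state dimension.
sizes = {n_x: n_regressors(n_x) for n_x in (10, 1000, 10**6)}
```

For $N_x = 10^6$ the closed form already gives about $5 \times 10^{11}$ regressors, so a dense $A$ would need that many columns per row, which is why the dense representation does not scale.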
Reducing the number of regressors
Locality
Locality of the physics: all multivariate monomials in the ODEs have variables $x_n$ that belong to a stencil, i.e. a local arrangement of grid points around a given node. In 1D and with a stencil of size $2L + 1$, the size of the dense $A$ is $N_x \times N_a$, where
\[ N_a = \sum_{l=L+1}^{2L+2} l = \frac{3}{2}(L + 1)(L + 2). \]
Homogeneity
Moreover, we can additionally assume translational invariance. In that case $A$ becomes a vector of size $N_a$.

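The count $N_a$ can be reproduced by enumerating local monomials. The sketch below rests on an assumption that is not spelled out in the slides: the two factors of a bilinear term are taken to be at most $L$ grid points apart, in addition to both lying in the stencil. Under that reading the enumeration matches the stated formula; the helper names are hypothetical.

```python
def local_monomials(L):
    """Enumerate monomials on a stencil of half-width L around node 0.

    Offsets are relative to the node. Assumption (not in the slides): the two
    factors of a bilinear term are at most L grid points apart.
    """
    offsets = range(-L, L + 1)
    monomials = [()]                                   # constant term
    monomials += [(i,) for i in offsets]               # linear terms
    monomials += [(i, j) for i in offsets for j in offsets
                  if i <= j and j - i <= L]            # bilinear terms
    return monomials

def n_a(L):
    """N_a as the sum over l from L + 1 to 2L + 2."""
    return sum(range(L + 1, 2 * L + 3))

for L in range(6):
    # Enumeration, sum, and closed form agree for each stencil half-width.
    assert len(local_monomials(L)) == n_a(L) == 3 * (L + 1) * (L + 2) // 2
```

With $L = 2$, for example, each row of $A$ has 18 coefficients regardless of $N_x$, and under translational invariance those 18 numbers describe the whole model.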
Bayesian analysis of the problem
Bayesian view on state and model estimation:
\[ p(A, x_{0:K} \mid y_{0:K}) = \frac{p(y_{0:K} \mid x_{0:K}, A)\, p(x_{0:K} \mid A)\, p(A)}{p(y_{0:K})}. \]
Data assimilation cost function assuming Gaussian error statistics and Markovian dynamics:
\[ J(A, x_{0:K}) = \frac{1}{2} \sum_{k=0}^{K} \left\| y_k - H_k(x_k) \right\|^2_{R_k^{-1}} + \frac{1}{2} \sum_{k=1}^{K} \left\| x_k - F_A(x_{k-1}) \right\|^2_{Q_k^{-1}} - \ln p(x_0, A), \]
where $F_A$ is the resolvent of the model between $t_k$ and $t_k + \Delta t$.
→ This allows handling partial and noisy observations.
Typical machine learning cost function, with $H_k = I_k$ in the limit $R_k \to 0$:
\[ J(A) \approx \frac{1}{2} \sum_{k=1}^{K} \left\| y_k - F_A(y_{k-1}) \right\|^2_{Q_k^{-1}} - \ln p(y_0, A). \]

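A minimal sketch of evaluating the data assimilation cost function follows, under simplifying assumptions that are mine rather than the talk's: the covariances are scalar multiples of the identity ($R_k = \sigma_{\mathrm{obs}}^2 I$, $Q_k = \sigma_{\mathrm{mod}}^2 I$), the prior is flat so the $-\ln p(x_0, A)$ term is dropped, the surrogate resolvent is a hypothetical scalar model $F_A(x) = a\,x$, and the observation operator keeps every second component.

```python
def sq_norm(v, variance):
    """Squared norm of v weighted by the inverse of the covariance variance * I."""
    return sum(vi * vi for vi in v) / variance

def surrogate(a, x):
    # Hypothetical surrogate resolvent F_A(x) = a * x (illustration only).
    return [a * xi for xi in x]

def obs_operator(x):
    # Assumed partial observation operator: keep every second component.
    return x[::2]

def cost(a, xs, ys, var_obs, var_mod):
    """J(A, x_{0:K}) with a flat prior (the -ln p(x_0, A) term is dropped)."""
    j_obs = sum(sq_norm([y - h for y, h in zip(yk, obs_operator(xk))], var_obs)
                for xk, yk in zip(xs, ys))
    j_mod = sum(sq_norm([xi - fi for xi, fi in zip(xs[k], surrogate(a, xs[k - 1]))],
                        var_mod)
                for k in range(1, len(xs)))
    return 0.5 * j_obs + 0.5 * j_mod

xs = [[1.0, 2.0], [0.5, 1.0], [0.25, 0.5]]   # candidate trajectory x_{0:2}
ys = [[1.1], [0.4], [0.25]]                  # noisy observations of component 0

j_consistent = cost(0.5, xs, ys, var_obs=0.01, var_mod=0.01)
j_mismatch = cost(0.9, xs, ys, var_obs=0.01, var_mod=0.01)
```

The candidate trajectory halves at every step, so `a = 0.5` leaves only the observation misfit in the cost, while `a = 0.9` adds a large model-error term. Minimizing over both `a` and `xs` is what lets the method cope with partial and noisy observations.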