SLIDE 1

Introduction to Data Assimilation

Maëlle Nodet

team.inria.fr/moise/maelle
Université de Grenoble, INRIA, LJK

GdT Couplage, Montpellier, 6 March 2012

SLIDE 2

What is data assimilation?

Optimally combine different sources of information to estimate the state of a system:

  • model equations
  • observations, data
  • background, a priori information
  • statistics

SLIDE 3

Data only

(Figure: observed data vs. reference field, with observations at 1 of every 25 gridpoints.)

SLIDE 4

Model only

(Figure: model run vs. reference field, after 4 months of simulation.)

SLIDE 5

Model-Data coupling

(Figure: reference field vs. the field obtained after data assimilation.)

SLIDE 6

What is data assimilation for?

Historically: meteorology. Later, oceanography. Today, many other fields: glaciology, seismology, nuclear fusion, medicine, agronomy, etc.

SLIDE 7

What is data assimilation for?

Historically: initial state estimation, for weather forecasting. Today, many other applications:

  • initial conditions for predictions,
  • calibration and validation,
  • observing system design, monitoring and assessment,
  • reanalysis,
  • better understanding (model errors, data errors, physical process interactions, parameters, etc.).

SLIDE 8

Example from ocean forecasting

Short-term ocean forecasting (a few weeks) of currents, temperature, salinity, for fisheries, coastal economy, sailing... We aim at producing an estimate $x^a$ of the true state $x^t = (u, v, T, S)$ of the ocean at initial time, to initialize forecasts.

We are given:

  • a background estimate $x^b = (u^b, v^b, T^b, S^b)$, which is either a previous forecast or comes from climatology,
  • partial observations $y^o = H(x^t) + \epsilon^o$, distributed over a given time window, e.g. temperatures from buoys, sea surface elevation from satellites, currents from moorings, . . .

The observation operator $H$ contains the dynamical model mapping the initial state of the ocean to the actual temperature, currents, sea surface height at given points in space and time.

SLIDE 9

Example from glaciology

Short-term ice dynamics forecasting (100 years), to estimate the contribution of Antarctica and Greenland to sea level change. We aim at producing an estimate $x^a$ of the true input parameter $x^t = \beta$, the basal drag coefficient (a function of space, constant over time) at the bottom of the ice cap.

We are given:

  • a background estimate $x^b = \beta^b$, which is roughly inferred from surface velocities,
  • partial observations $y^o = H(x^t) + \epsilon^o$, e.g. surface velocities, ice surface elevation, approximate bedrock topography . . .

The observation operator $H$ contains the dynamical model mapping the basal drag of the ice cap to the surface variables at given points in space and time.

SLIDE 10

Outline

1. Stochastic data assimilation
     • Best linear unbiased estimator (BLUE)
     • Kalman filter algorithm

2. Variational Data Assimilation
     • Principle of variational methods
     • Gradient-based optimization
     • Variational algorithms

3. Implementation issues
     • Non-linearities
     • High-dimensional problems
     • Gradient computation: adjoint method

SLIDE 11

Error statistics

Mean: $E(x) = \langle x \rangle$ for $x$ scalar; $E(x) = (E(x_1), E(x_2), \ldots, E(x_n))$ for $x$ vector-valued.

Variance, covariance ($x$, $y$ scalar):
$$\mathrm{Var}(x) = E\big((x - E(x))^2\big), \qquad \mathrm{Cov}(x, y) = E\big((x - E(x))(y - E(y))\big)$$

We say that errors are:

  • unbiased if $E(\epsilon) = 0$;
  • uncorrelated if $E(\epsilon_1 \epsilon_2^T) = 0$;
  • non-trivial if $\mathrm{Cov}(\epsilon)$ is positive-definite.

SLIDE 12

Covariance matrix

Covariance matrix ($x$ vector-valued):
$$\mathrm{Cov}(x) = E\big((x - E(x))(x - E(x))^T\big), \qquad (\mathrm{Cov}(x))_{i,j} = \mathrm{Cov}(x_i, x_j) = E\big((x_i - E(x_i))(x_j - E(x_j))\big)$$

E.g. for $x = (x_1, x_2, x_3)$:
$$\mathrm{Cov}(x) = \begin{pmatrix} \mathrm{Var}(x_1) & \mathrm{Cov}(x_1, x_2) & \mathrm{Cov}(x_1, x_3) \\ \mathrm{Cov}(x_1, x_2) & \mathrm{Var}(x_2) & \mathrm{Cov}(x_2, x_3) \\ \mathrm{Cov}(x_1, x_3) & \mathrm{Cov}(x_2, x_3) & \mathrm{Var}(x_3) \end{pmatrix}$$

SLIDE 13

Notations

State:

  • $x$: state vector or input parameters
  • $x^t$: true state (unknown)
  • $x^b$: background state (a priori information); background error $\epsilon^b = x^b - x^t$, covariance matrix $B$
  • $x^a$: analyzed state (result of the assimilation process); analysis error $\epsilon^a = x^a - x^t$, covariance matrix $A$

Observations:

  • observation vector $y^o$
  • observation operator $H$, mapping state space to observation space: $y^o = H(x^t) + \epsilon^o$
  • observation error $\epsilon^o$, covariance matrix $R$

SLIDE 14

Problem position: what we have

We aim at producing an estimate $x^a$ of the true input parameters $x^t$ of the system. We are given:

  • a background estimate $x^b$, whose error $\epsilon^b$ is assumed unbiased and non-trivial, with covariance matrix $B$ given,
  • partial observations $y^o = H(x^t) + \epsilon^o$, where $\epsilon^o$ is unbiased and non-trivial, with covariance matrix $R$ given.

The observation operator $H$ maps the input parameters to the observation variables (it can contain complex laws, PDEs, non-linear physics, . . . ).

We also assume that:

  • $H = \mathbf{H}$ is a linear operator,
  • $\epsilon^o$ and $\epsilon^b$ are not correlated.

SLIDE 15

Problem position: what we look for

We aim at producing an estimate $x^a$ of the true state $x^t$ of the system. The best estimate is sought as a linear combination of the background estimate and the observations:
$$x^a = L\, x^b + K\, y^o$$

Optimality criterion: we look for an unbiased estimate $x^a$ with minimal total variance $\mathrm{tr}(A)$.

SLIDE 16

Best linear unbiased estimator, or least squares analysis

1. BLUE analysis:
$$x^a = (I - KH)\, x^b + K y^o = x^b + K (y^o - H x^b), \qquad K = B H^T (H B H^T + R)^{-1}$$
$K$: gain, or weight matrix; $y^o - H x^b$: innovation.

2. Analysis covariance matrix: $A = (I - KH)\, B$

3. Equivalent variational optimization problem (optimal least squares):
$$x^a = \arg\min J, \qquad J(x) = (x - x^b)^T B^{-1} (x - x^b) + (y^o - H(x))^T R^{-1} (y^o - H(x))$$
$J$: cost function.
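As an illustration, here is a minimal NumPy sketch of the BLUE formulas above; the function name and the toy inputs are ours, not from the slides:

    import numpy as np

    def blue_analysis(xb, B, yo, H, R):
        """BLUE / optimal least-squares analysis.
        xb: background (n,); B: background error covariance (n, n);
        yo: observations (m,); H: linear observation operator (m, n);
        R: observation error covariance (m, m)."""
        K = B @ H.T @ np.linalg.inv(H @ B @ H.T + R)  # gain matrix
        xa = xb + K @ (yo - H @ xb)                   # background + gain * innovation
        A = (np.eye(len(xb)) - K @ H) @ B             # analysis error covariance
        return xa, A

    # Toy example: 2-component state, only the first component observed.
    xb = np.array([1.0, 2.0])
    B = np.array([[1.0, 0.5], [0.5, 1.0]])
    H = np.array([[1.0, 0.0]])
    R = np.array([[0.25]])
    yo = np.array([1.8])
    xa, A = blue_analysis(xb, B, yo, H, R)

Note how the off-diagonal term of $B$ spreads part of the correction to the unobserved second component.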

SLIDE 17

Data assimilation methods

Two types of methods:

1. Direct computation of the BLUE and of the gain matrix $K$. Main algorithm: Kalman filter → stochastic data assimilation, this section.

2. Minimization of the cost function $J$ using optimization and adjoint methods. Main algorithm: 4D-Var → variational data assimilation, next section.

SLIDE 18

Time-dependent problems: the Kalman filter sequence

(Figure: the Kalman filter sequence over time steps $k-2$, $k-1$, $k$. At each step, the forecast $x^f$ is corrected by the observation $y^o$ into the analysis $x^a$, which the model then propagates to the next forecast $x^f$.)

SLIDE 19

Back to notations

Vectors ($k$: time index):

  • $x^f_k$: forecast state (background), forecast error covariance matrix $P^f_k$
  • $x^a_k$: analyzed state (result of the assimilation process), analysis error covariance matrix $P^a_k$

Operators:

  • model operator: $x^t_{k+1} = M_{k,k+1}(x^t_k) + \eta_{k,k+1}$, model error $\eta_{k,k+1}$, covariance matrix $Q_k$
  • observation operator: $y^o_k = H_k(x^t_k) + \epsilon^o_k$, observation error $\epsilon^o_k$, covariance matrix $R_k$

Kalman's hypotheses, schematically: the model and observation operators $M_{k,k+1}$ and $H_k$ are linear; errors are unbiased, Gaussian, independent and white in time.

SLIDE 20

Kalman’s hypotheses

Schematically: the model and observation operators are linear, denoted $M_{k,k+1}$ and $H_k$; errors are unbiased, Gaussian, independent and white in time.

SLIDE 21

Kalman’s hypotheses

Uncensored version:

  • The initial state is Gaussian: $x^t_0 \sim N(x^b_0, P^b_0)$;
  • The dynamical model $M_k$ is linear, denoted $M_{k,k+1}$;
  • The model errors are unbiased and Gaussian: $\eta_k \sim N(0, Q_k)$;
  • The model errors are white in time: $\langle \eta_k \eta_j^T \rangle = 0$ if $k \neq j$;
  • The observation operators $H_k$ are linear, denoted $H_k$;
  • The observation errors are unbiased and Gaussian: $\epsilon^o_k \sim N(0, R_k)$;
  • The observation errors are white in time: $\langle \epsilon^o_k (\epsilon^o_j)^T \rangle = 0$ if $k \neq j$;
  • Errors of different types are independent: $\langle \eta_k (\epsilon^o_j)^T \rangle = 0$, $\langle \eta_k (\epsilon^b)^T \rangle = 0$, $\langle \epsilon^o_k (\epsilon^b)^T \rangle = 0$.

SLIDE 22

The Kalman filter equations

1. Initialization: $x^f_0$ and $P^f_0$ are given, e.g. equal to $x^b$ and $B$.

2. Analysis step (BLUE analysis):
$$K_k = P^f_k H_k^T \big[ H_k P^f_k H_k^T + R_k \big]^{-1}$$
$$x^a_k = x^f_k + K_k (y^o_k - H_k x^f_k)$$
$$P^a_k = (I - K_k H_k) P^f_k$$

3. Forecast step (evolution model):
$$x^f_{k+1} = M_{k,k+1}\, x^a_k$$
$$P^f_{k+1} = M_{k,k+1} P^a_k M_{k,k+1}^T + Q_k$$
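A compact NumPy sketch of these two alternating steps, assuming for simplicity that $M$, $H$, $Q$, $R$ are constant in time (the names and structure are ours):

    import numpy as np

    def kalman_filter(xf, Pf, M, Q, H, R, observations):
        """Linear Kalman filter: alternate BLUE analysis and model forecast."""
        analyses = []
        for yo in observations:
            # Analysis step (BLUE analysis).
            K = Pf @ H.T @ np.linalg.inv(H @ Pf @ H.T + R)
            xa = xf + K @ (yo - H @ xf)
            Pa = (np.eye(len(xf)) - K @ H) @ Pf
            analyses.append(xa)
            # Forecast step (evolution model).
            xf = M @ xa
            Pf = M @ Pa @ M.T + Q
        return analyses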

SLIDE 23

Back to notations

Vectors:

  • $x$: state vector or input parameters
  • $x^b$: background state (a priori information); background error $\epsilon^b = x^b - x^t$, covariance matrix $B$
  • $x^a$: analyzed state (result of the assimilation process)
  • $y^o$: observation vector

Operators:

  • model operator: $x^t_k = M_{k,k-1}(x^t_{k-1}) = M_{0 \to k}(x^t_0)$
  • observation operator: $y^o = H(x^t) + \epsilon^o$, $y^o_k = H_k(x^t_k) + \epsilon^o_k$, observation error $\epsilon^o_k$, covariance matrix $R_k$

SLIDE 24

Back to the BLUE

The variational approach to the BLUE consists in finding $x^a = \arg\min J$:
$$J(x) = \frac{1}{2}(x - x^b)^T B^{-1} (x - x^b) + \frac{1}{2}(y^o - H(x))^T R^{-1} (y^o - H(x)) = \underbrace{\frac{1}{2}\|x - x^b\|_B^2}_{J^b} + \underbrace{\frac{1}{2}\|H(x) - y^o\|_R^2}_{J^o}$$
with the notation $\|v\|_B^2 = v^T B^{-1} v$.

If the problem is time-dependent, and the unknown $x$ is the initial state vector:
$$J(x) = \frac{1}{2}\|x - x^b\|_B^2 + \frac{1}{2}\sum_k \|H_k(x_k) - y^o_k\|_{R_k}^2 = \underbrace{\frac{1}{2}\|x - x^b\|_B^2}_{J^b} + \underbrace{\frac{1}{2}\sum_k \|H_k(M_{0 \to k}(x)) - y^o_k\|_{R_k}^2}_{J^o}$$

SLIDE 25

Variational methods principle

$$J(x) = \frac{1}{2}\|x - x^b\|_B^2 + \frac{1}{2}\sum_k \|H_k(x_k) - y^o_k\|_{R_k}^2 = J^b + J^o = \frac{1}{2}\|x - x^b\|_B^2 + \frac{1}{2}\sum_k \|H_k(M_{0 \to k}(x)) - y^o_k\|_{R_k}^2$$

(Figure: 3D-Var schematic over the assimilation window. The previous forecast, started from $X^b$, is corrected into $X^a$: $J^b$ measures the departure from the background and each $J^o$ term the misfit to an observation inside the window.)

SLIDE 26

Wake up here!

Fundamental remark: once $J$ is defined (i.e. once all the ingredients are chosen: control variables, error statistics, norms, observations...), the problem is entirely defined, and hence so is its solution. ⇒ The "physical part" of data assimilation lies in the definition of $J$. The rest of the job, i.e. minimizing $J$, is "technical" work.

SLIDE 27

Descent methods

Descent methods to minimize a function require knowledge of (an estimate of) its gradient:
$$x_{k+1} = x_k + \alpha_k d_k \qquad (k: \text{iteration number})$$
with
$$d_k = \begin{cases} -\nabla J(x_k) & \text{gradient method} \\ -[\mathrm{Hess}(J)(x_k)]^{-1} \nabla J(x_k) & \text{(quasi-)Newton method} \\ -\nabla J(x_k) + \dfrac{\|\nabla J(x_k)\|^2}{\|\nabla J(x_{k-1})\|^2}\, d_{k-1} & \text{conjugate gradient} \\ \ldots & \ldots \end{cases}$$
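For instance, the plain gradient method from the table can be sketched as follows; the fixed step $\alpha_k = \alpha$ is a simplification of ours (practical systems use a line search or a quasi-Newton update):

    import numpy as np

    def gradient_method(grad_J, x0, alpha=0.1, eps=1e-8, max_iter=1000):
        """Plain gradient descent: x_{k+1} = x_k - alpha * grad J(x_k)."""
        x = x0.copy()
        for _ in range(max_iter):
            d = -grad_J(x)              # descent direction
            if np.linalg.norm(d) < eps:
                break                   # gradient small enough: stop
            x = x + alpha * d
        return x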

SLIDE 28

Gradient computation

The computation of $\nabla J(x)$ may be difficult if the dependency of $J$ on the control variable $x$ is not direct.

Example: parameter estimation. Let $u(t)$ be the solution of a differential equation:
$$-K_1 u''(t) + K_2 u'(t) = f(t), \quad t \in\, ]0, 1[\,; \qquad u(0) = u(1) = 0$$

  • $x = K = (K_1, K_2)$: parameters of this equation
  • $u^{obs}(t)$: an observation of $u(t)$
  • $J(K) = \frac{1}{2}\|u - u^{obs}\|^2$

Then
$$\nabla J(K) \cdot k = \langle \hat{u}[K](k),\; u - u^{obs} \rangle \quad \text{with} \quad \hat{u}[K](k) = \lim_{\alpha \to 0} \frac{u_{K + \alpha k} - u_K}{\alpha}$$

SLIDE 29

The computation of $\nabla J(x)$ may be difficult if the dependency of $J$ on the control variable $x$ is not direct. It is often difficult (or even impossible) to obtain the gradient through the computation of growth rates.

Example: initial state estimation.
$$\frac{du(t)}{dt} = M(u(t)), \quad t \in [0, T]; \qquad u(t = 0) = x, \quad x = \begin{pmatrix} x_1 \\ \vdots \\ x_N \end{pmatrix}$$

$$J(x) = \frac{1}{2}\int_0^T \|u(t) - u^{obs}(t)\|^2\, dt \quad \to \quad \text{requires one model run}$$

$$\nabla J(x) = \begin{pmatrix} \dfrac{\partial J}{\partial x_1}(x) \\ \vdots \\ \dfrac{\partial J}{\partial x_N}(x) \end{pmatrix} \simeq \begin{pmatrix} [J(x + \alpha e_1) - J(x)]/\alpha \\ \vdots \\ [J(x + \alpha e_N) - J(x)]/\alpha \end{pmatrix} \quad \to \quad N + 1 \text{ model runs}$$
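The growth-rate (finite-difference) estimate above translates directly into code; as the slide counts, the cost is one evaluation of $J$ per component, plus one for the reference value (a sketch, with illustrative names):

    import numpy as np

    def fd_gradient(J, x, alpha=1e-6):
        """Forward finite differences: N + 1 evaluations of J ('model runs')."""
        J0 = J(x)                        # one run for the reference value
        g = np.zeros_like(x)
        for i in range(len(x)):          # one extra run per component
            e = np.zeros_like(x)
            e[i] = 1.0                   # canonical basis vector e_i
            g[i] = (J(x + alpha * e) - J0) / alpha
        return g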

SLIDE 30

Gradient computation

In real applications like meteorology/oceanography, $N = \dim(x) = O(10^6 - 10^8)$ → this method cannot be used.

Adjoint method: the adjoint method provides a very efficient way to compute $\nabla J$, with only one run of the adjoint model (computationally 4 to 10 times the cost of the direct model).

Attention! Conversely, do not forget that, if the size of the control variable is very small (< 10), $\nabla J$ can easily be estimated by the computation of growth rates.

SLIDE 31

Time-independent problems: 3D-Var

Cost function:
$$J(x_0) = (x_0 - x^b)^T B^{-1} (x_0 - x^b) + (y - H(x_0))^T R^{-1} (y - H(x_0))$$

Gradient:
$$\nabla J(x) = 2 B^{-1} (x - x^b) - 2 H^T R^{-1} (y - H(x))$$

Iterative minimization algorithm:

  • Initialization: $x_0 = x^b$, $n = 0$
  • While $\|\nabla J\| > \varepsilon$ and $n \leq n_{max}$, do:
    1. Compute $J$
    2. Compute $\nabla J$
    3. Descent and update of $x_0$
    4. $n = n + 1$
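A minimal sketch of this 3D-Var loop, with the descent iterations delegated to SciPy's L-BFGS-B optimizer rather than hand-written; the toy inputs are illustrative, not from the slides:

    import numpy as np
    from scipy.optimize import minimize

    def make_3dvar(xb, Binv, yo, H, Rinv):
        """Cost function and gradient for 3D-Var with a linear H."""
        def J(x):
            dxb, dy = x - xb, yo - H @ x
            return dxb @ Binv @ dxb + dy @ Rinv @ dy
        def gradJ(x):
            return 2 * Binv @ (x - xb) - 2 * H.T @ Rinv @ (yo - H @ x)
        return J, gradJ

    xb = np.array([1.0, 2.0])
    Binv = np.linalg.inv(np.array([[1.0, 0.5], [0.5, 1.0]]))
    H = np.array([[1.0, 0.0]])
    Rinv = np.linalg.inv(np.array([[0.25]]))
    yo = np.array([1.8])
    J, gradJ = make_3dvar(xb, Binv, yo, H, Rinv)
    xa = minimize(J, x0=xb, jac=gradJ, method="L-BFGS-B").x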

SLIDE 32

Time-dependent problems: 4D-Var

Cost function:
$$J(x_0) = \|x_0 - x^b_0\|_B^2 + \sum_{i=0}^{n} \|y^o_i - H_i(M_i(M_{i-1}(\ldots M_1(x_0))))\|_R^2$$

The gradient involves the adjoint model:
$$-\frac{1}{2}\nabla J^o(x) = \sum_{i=0}^{n} M_1^T \ldots M_{i-1}^T M_i^T H_i^T R_i^{-1} d_i$$

Innovation vector $d_i$:
$$d_i = y^o_i - H_i(M_i(M_{i-1}(\ldots M_1(x))))$$

SLIDE 33

4D-Var algorithm

  • Initialization: $x = x_0$, $n = 0$
  • While $\|\nabla J\| > \varepsilon$ and $n \leq n_{max}$, do:
    1. Compute $J$ thanks to the direct model $M$ and the observation operator $H$
    2. Store the innovation vectors $d_i$
    3. Compute $\nabla J$ thanks to the backward integration of the adjoint model $M^T$ and the adjoint of the observation operator $H^T$
    4. Descent and update of $x$
    5. $n = n + 1$

SLIDE 34

Equivalence 4D-Var – Kalman filter

Under the Kalman filter hypotheses, the 4D-Var and Kalman filter algorithms are equivalent. More precisely: starting with the same background, with equal covariance matrices at the beginning of the time window, and assimilating the same observations, the same result is reached at the end of the time window.

Both algorithms are optimal from a least-squares and minimal-variance point of view.

SLIDE 35

Linear assumption

Validity of the BLUE analysis: the Kalman filter and 4D-Var are valid (and produce equivalent results) if the model operator $M$ and the observation operator $H$ are linear.

In presence of non-linearities:

1. KF and 4D-Var are not equivalent anymore,
2. optimality of the analysis is lost.

SLIDE 36

Tangent linear hypothesis

In case of weak non-linearities, we can hope the linear analysis still gives some information...

Tangent linear hypothesis:
$$M_{0 \to i}(x_0) - M_{0 \to i}(x^b_0) \simeq \mathbf{M}_{0 \to i}(x_0 - x^b_0), \qquad H_i(x_i) - H_i(x^b_i) \simeq \mathbf{H}_i(x_i - x^b_i)$$

Stochastic approach → extended Kalman filter. Variational approach → incremental 3D- and 4D-Var.

SLIDE 37

Extended Kalman Filter

The non-linear operators $M$ and $H$ are replaced with their tangent linear operators $\mathbf{M}$ and $\mathbf{H}$ in the algorithm, except in the innovation and state forecast steps:

1. Initialization: $x^f_0$ and $P^f_0$ are given, e.g. equal to $x^b$ and $B$.

2. Analysis step:
$$K_k = P^f_k \mathbf{H}_k^T \big[ \mathbf{H}_k P^f_k \mathbf{H}_k^T + R_k \big]^{-1}$$
$$x^a_k = x^f_k + K_k \big( y^o_k - H_k(x^f_k) \big)$$
$$P^a_k = (I - K_k \mathbf{H}_k) P^f_k$$

3. Forecast step:
$$x^f_{k+1} = M_{k,k+1}(x^a_k)$$
$$P^f_{k+1} = \mathbf{M}_{k,k+1} P^a_k \mathbf{M}_{k,k+1}^T + Q_k$$
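One EKF cycle can be sketched as follows; `M` and `H` are the non-linear operators, while `M_tl` and `H_tl` return their tangent linear matrices at a given state (hypothetical names of ours):

    import numpy as np

    def ekf_step(xf, Pf, yo, M, H, M_tl, H_tl, Q, R):
        """Extended Kalman filter: non-linear M, H only in the innovation
        and the state forecast; tangent linear matrices everywhere else."""
        Hk = H_tl(xf)                                       # linearize H at xf
        K = Pf @ Hk.T @ np.linalg.inv(Hk @ Pf @ Hk.T + R)
        xa = xf + K @ (yo - H(xf))                          # non-linear innovation
        Pa = (np.eye(len(xf)) - K @ Hk) @ Pf
        Mk = M_tl(xa)                                       # linearize M at xa
        xf_next = M(xa)                                     # non-linear state forecast
        Pf_next = Mk @ Pa @ Mk.T + Q
        return xa, Pa, xf_next, Pf_next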

SLIDE 38

Incremental 4D-Var

4D-Var cost function:
$$J(x) = \|x - x^b\|_B^2 + \sum_{k=0}^{n} \|y^o_k - H_k(M_k(M_{k-1}(\ldots M_1(x))))\|_R^2$$

The cost function is linearized around $x^b$, with increment $\delta x$: $x = x^b + \delta x$.

Incremental innovation vector $d_k$:
$$d_k = y^o_k - H_k(M_k(M_{k-1}(\ldots M_1(x^b))))$$

Incremental cost function (quadratic):
$$J(\delta x) = \|\delta x\|_B^2 + \sum_{k=0}^{n} \|d_k - \mathbf{H}_k(\mathbf{M}_k(\mathbf{M}_{k-1}(\ldots \mathbf{M}_1(\delta x))))\|_R^2$$

SLIDE 39

Incremental 4D-Var

– Initialization: $x^r_0 = x^b$
– Start outer loop:
  • Non-linear model integration: $x^r_k = M_{0 \to k}[x^r_0]$
  • Innovation vector computation: $d_k = y^o_k - H_k(x^r_k)$
  • Start inner loop:
    • Computation of $J$, using the operators $\mathbf{M}$ and $\mathbf{H}$ linearized around $x^r$
    • Computation of $\nabla J$, using the adjoint operators $\mathbf{M}^T$ and $\mathbf{H}^T$
    • Minimization via a descent method
  • End of inner loop
  • Analysis increment update: $\delta x^a_0 = \delta x_0$
  • Reference state update: $x^r_0 = x^r_0 + \delta x^a_0$
– End of outer loop
– Compute the final analysis: $x^a_0 = x^r_0$, $x^a_k = M_{0 \to k}[x^a_0]$.

SLIDE 40

Incremental 4D-Var

(Figure: schematic of the incremental 4D-Var outer/inner loops, from the YAO user's guide.)

SLIDE 41

Matrix size problems

Vector and matrix sizes: size of $x$: $n$; size of $y^o$: $m$ ⇒ size of $B$: $n \times n$; size of $H$: $m \times n$.

For some applications, $n$ and $m$ are large ($10^6$ to $10^8$) ⇒ it is impossible to store/compute/multiply/invert the data assimilation matrices $B$ and $K$!

Possible solutions to model covariance matrices:

  • rank reduction methods: e.g. replace $B$ by $S S^T$ where $S$ is smaller ($n \times r$, with $r \ll n$);
  • ensemble modelling, using the Monte Carlo method.

SLIDE 42

Rank reduction

Square root decomposition: a symmetric positive-definite matrix $B$ can be decomposed into $S S^T$, where $S$ is an $n \times n$ matrix. As before, if $n$ is large, $S$ cannot be computed/stored.

Rank reduction consists in:

1. choosing only a small number $r$ of significant columns, to get a matrix $S_r$ of size $n \times r$,

2. setting $B_r = S_r S_r^T$ and hoping that $B_r \simeq B$.

Methods to compute the columns of $S_r$: empirical orthogonal functions, principal component analysis, proper orthogonal decomposition, singular value decomposition, etc. → SEEK filter, reduced-rank (incremental) 4D-Var
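A sketch of the eigendecomposition route, for a $B$ small enough to factor explicitly (in practice $S_r$ is built without ever forming $B$; names are ours):

    import numpy as np

    def reduce_rank(B, r):
        """Truncated square root: keep the r leading eigenpairs of B."""
        w, V = np.linalg.eigh(B)             # B symmetric positive definite
        idx = np.argsort(w)[::-1][:r]        # indices of the r largest eigenvalues
        Sr = V[:, idx] * np.sqrt(w[idx])     # n x r factor, B_r = Sr @ Sr.T
        return Sr, Sr @ Sr.T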

SLIDE 43

Ensemble modelling

Monte Carlo estimation: if $x^1, x^2, \ldots, x^N$ are $N$ realizations of $x$, then an estimator of $E(x)$, based on the law of large numbers, is given by
$$\hat{x} = \frac{1}{N} \sum_{i=1}^{N} x^i$$

Also, if $x^1, x^2, \ldots, x^N$ are $N$ well-chosen states of a physical system, the background error covariance matrix $B$ can be estimated by
$$B \simeq \hat{B} = \frac{1}{N-1} \sum_{i=1}^{N} (x^i - \hat{x})(x^i - \hat{x})^T$$

→ ensemble Kalman filter, 4D-Var with ensemble modelling of covariances.
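The estimator $\hat{B}$ in code, for an ensemble stored as rows (a sketch; practical ensemble filters add localization and inflation on top of this raw estimate):

    import numpy as np

    def ensemble_covariance(X):
        """X: ensemble of shape (N, n), one state per row. Returns B-hat (n, n)."""
        anomalies = X - X.mean(axis=0)                  # deviations from the ensemble mean
        return anomalies.T @ anomalies / (len(X) - 1)   # unbiased covariance estimate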

SLIDE 44

Back to 4D-Var

Initial state estimation, $x = x_0$; cost function:
$$J(x) = \|x - x^b\|_B^2 + \sum_{i=0}^{n} \|y^o_i - H_i(M_i(M_{i-1}(\ldots M_1(x))))\|_R^2$$

The gradient involves the adjoint model:
$$-\frac{1}{2}\nabla J^o(x) = \sum_{i=0}^{n} M_1^T \ldots M_{i-1}^T M_i^T H_i^T R_i^{-1} d_i$$

Innovation vector $d_i$:
$$d_i = y^o_i - H_i(M_i(M_{i-1}(\ldots M_1(x))))$$

SLIDE 45

Gradient computation

Factorization of the gradient of $J^o$:
$$-\frac{1}{2}\nabla J^o(x) = \sum_{i=0}^{n} M_1^T \ldots M_{i-1}^T M_i^T H_i^T R_i^{-1} d_i$$
$$= H_0^T R_0^{-1} d_0 + M_1^T H_1^T R_1^{-1} d_1 + M_1^T M_2^T H_2^T R_2^{-1} d_2 + \ldots + M_1^T \ldots M_{n-1}^T M_n^T H_n^T R_n^{-1} d_n$$
$$= H_0^T R_0^{-1} d_0 + M_1^T \Big[ H_1^T R_1^{-1} d_1 + M_2^T \big[ H_2^T R_2^{-1} d_2 + \ldots + M_n^T H_n^T R_n^{-1} d_n \big] \Big]$$

Adjoint model:
$$x^*_n = H_n^T R_n^{-1} d_n, \qquad x^*_k = M_{k+1}^T x^*_{k+1} + H_k^T R_k^{-1} d_k, \quad k = n-1, \ldots, 0$$
$$\Rightarrow \nabla J^o = -2\, x^*_0$$
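The backward recursion above, sketched with the linear operators stored as matrices; the indexing convention (`M_list[k]` maps step $k-1$ to $k$, so `M_list[0]` is unused) and all names are ours:

    import numpy as np

    def adjoint_gradient_Jo(M_list, H_list, Rinv_list, d_list):
        """One backward sweep of the adjoint model; returns grad J^o = -2 x*_0."""
        n = len(d_list) - 1
        x_star = H_list[n].T @ Rinv_list[n] @ d_list[n]        # x*_n
        for k in range(n - 1, -1, -1):
            # x*_k = M_{k+1}^T x*_{k+1} + H_k^T R_k^{-1} d_k
            x_star = M_list[k + 1].T @ x_star + H_list[k].T @ Rinv_list[k] @ d_list[k]
        return -2.0 * x_star

One forward run stores the innovations $d_i$; one backward sweep then yields the full gradient, instead of the $N + 1$ model runs of the finite-difference approach.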

SLIDE 46

Automatic differentiation

Adjoint code construction can be very technical work:

  • time-dependence and non-linearities,
  • non-differentiabilities, thresholds,
  • iterative solvers.

Community portal for automatic differentiation: http://www.autodiff.org

Our favorite, with advanced features and tailored for all kinds of applications, even large-scale time-dependent ones: Tapenade http://www-sop.inria.fr/tropics
