SLIDE 1

Variational approach to data assimilation: optimization aspects and adjoint method

Eric Blayo, University of Grenoble and INRIA

SLIDE 2

Objectives

- introduce data assimilation as an optimization problem
- discuss the different forms of the objective function
- discuss their properties w.r.t. optimization
- introduce the adjoint technique for the computation of the gradient

Link with statistical methods: cf. lectures by E. Cosme.
Variational data assimilation algorithms, tangent and adjoint codes: cf. lectures by M. Nodet and A. Vidard.

SLIDE 3

Introduction: model problem

Outline

- Introduction: model problem
- Definition and minimization of the cost function
- The adjoint method

SLIDE 4

Introduction: model problem

Model problem

Two different measurements of a single quantity are available. Which estimate of its true value? → least squares approach.

SLIDE 5

Introduction: model problem

Model problem

Two different measurements of a single quantity are available. Which estimate of its true value? → least squares approach.

Example: 2 observations $y_1 = 19^\circ$C and $y_2 = 21^\circ$C of the (unknown) present temperature $x$.

- Let $J(x) = \frac{1}{2}\left[(x - y_1)^2 + (x - y_2)^2\right]$
- $\min_x J(x) \;\longrightarrow\; \hat{x} = \dfrac{y_1 + y_2}{2} = 20^\circ$C

SLIDE 6

Introduction: model problem

Model problem

Observation operator

If the units differ: $y_1 = 66.2^\circ$F and $y_2 = 69.8^\circ$F

- Let $H(x) = \frac{9}{5}x + 32$
- Let $J(x) = \frac{1}{2}\left[(H(x) - y_1)^2 + (H(x) - y_2)^2\right]$
- $\min_x J(x) \;\longrightarrow\; \hat{x} = 20^\circ$C

SLIDE 7

Introduction: model problem

Model problem

Observation operator

If the units differ: $y_1 = 66.2^\circ$F and $y_2 = 69.8^\circ$F

- Let $H(x) = \frac{9}{5}x + 32$
- Let $J(x) = \frac{1}{2}\left[(H(x) - y_1)^2 + (H(x) - y_2)^2\right]$
- $\min_x J(x) \;\longrightarrow\; \hat{x} = 20^\circ$C

Drawback #1: if observation units are inhomogeneous, $y_1 = 66.2^\circ$F and $y_2 = 21^\circ$C:

- $J(x) = \frac{1}{2}\left[(H(x) - y_1)^2 + (x - y_2)^2\right] \;\longrightarrow\; \hat{x} = 19.47^\circ$C !!

SLIDE 8

Introduction: model problem

Model problem

Observation operator

If the units differ: $y_1 = 66.2^\circ$F and $y_2 = 69.8^\circ$F

- Let $H(x) = \frac{9}{5}x + 32$
- Let $J(x) = \frac{1}{2}\left[(H(x) - y_1)^2 + (H(x) - y_2)^2\right]$
- $\min_x J(x) \;\longrightarrow\; \hat{x} = 20^\circ$C

Drawback #1: if observation units are inhomogeneous, $y_1 = 66.2^\circ$F and $y_2 = 21^\circ$C:

- $J(x) = \frac{1}{2}\left[(H(x) - y_1)^2 + (x - y_2)^2\right] \;\longrightarrow\; \hat{x} = 19.47^\circ$C !!

Drawback #2: if observation accuracies are inhomogeneous. If $y_1$ is twice as accurate as $y_2$ (half the error variance), one should obtain $\hat{x} = \dfrac{2y_1 + y_2}{3} = 19.67^\circ$C:

- $\longrightarrow\; J$ should be $J(x) = \dfrac{1}{2}\left[\dfrac{(x - y_1)^2}{1/2} + \dfrac{(x - y_2)^2}{1}\right]$

SLIDE 9

Introduction: model problem

Model problem

General form

Minimize $J(x) = \dfrac{1}{2}\left[\dfrac{(H_1(x) - y_1)^2}{\sigma_1^2} + \dfrac{(H_2(x) - y_2)^2}{\sigma_2^2}\right]$

SLIDE 10

Introduction: model problem

Model problem

General form

Minimize $J(x) = \dfrac{1}{2}\left[\dfrac{(H_1(x) - y_1)^2}{\sigma_1^2} + \dfrac{(H_2(x) - y_2)^2}{\sigma_2^2}\right]$

If $H_1 = H_2 = Id$:

$$J(x) = \frac{1}{2}\,\frac{(x - y_1)^2}{\sigma_1^2} + \frac{1}{2}\,\frac{(x - y_2)^2}{\sigma_2^2}$$

which leads to

$$\hat{x} = \frac{y_1/\sigma_1^2 + y_2/\sigma_2^2}{1/\sigma_1^2 + 1/\sigma_2^2} \quad \text{(weighted average)}$$
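A minimal Python check of this weighted average, with hypothetical values for the observations and their variances:

```python
import numpy as np

# Two observations of the same temperature, with different accuracies.
y1, y2 = 19.0, 21.0          # observed values (deg C)
s12, s22 = 0.5, 1.0          # hypothetical error variances sigma_1^2, sigma_2^2

def J(x):
    """Weighted least-squares cost from the slide."""
    return 0.5 * ((x - y1) ** 2 / s12 + (x - y2) ** 2 / s22)

# Closed-form minimizer: inverse-variance weighted average.
x_hat = (y1 / s12 + y2 / s22) / (1 / s12 + 1 / s22)

# Crude numerical check: J is minimal at x_hat on a fine grid.
grid = np.linspace(15.0, 25.0, 100001)
x_num = grid[np.argmin(J(grid))]

print(x_hat)               # 19.666..., the 19.67 value of the previous slide
print(abs(x_num - x_hat))  # ~ grid resolution
```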

SLIDE 11

Introduction: model problem

Model problem

General form

Minimize $J(x) = \dfrac{1}{2}\left[\dfrac{(H_1(x) - y_1)^2}{\sigma_1^2} + \dfrac{(H_2(x) - y_2)^2}{\sigma_2^2}\right]$

If $H_1 = H_2 = Id$:

$$J(x) = \frac{1}{2}\,\frac{(x - y_1)^2}{\sigma_1^2} + \frac{1}{2}\,\frac{(x - y_2)^2}{\sigma_2^2}
\quad\Longrightarrow\quad
\hat{x} = \frac{y_1/\sigma_1^2 + y_2/\sigma_2^2}{1/\sigma_1^2 + 1/\sigma_2^2} \quad \text{(weighted average)}$$

Remark:

$$\underbrace{J''(\hat{x})}_{\text{convexity}} = \frac{1}{\sigma_1^2} + \frac{1}{\sigma_2^2} = \underbrace{[\mathrm{Var}(\hat{x})]^{-1}}_{\text{accuracy}} \quad \text{(cf. BLUE)}$$

SLIDE 12

Introduction: model problem

Model problem

Alternative formulation: background + observation

If one considers that $y_1$ is a prior (or background) estimate $x_b$ for $x$, and $y_2 = y$ is an independent observation, then:

$$J(x) = \underbrace{\frac{1}{2}\,\frac{(x - x_b)^2}{\sigma_b^2}}_{J_b} + \underbrace{\frac{1}{2}\,\frac{(x - y)^2}{\sigma_o^2}}_{J_o}$$

and

$$\hat{x} = \frac{x_b/\sigma_b^2 + y/\sigma_o^2}{1/\sigma_b^2 + 1/\sigma_o^2} = x_b + \underbrace{\frac{\sigma_b^2}{\sigma_b^2 + \sigma_o^2}}_{\text{gain}}\,\underbrace{(y - x_b)}_{\text{innovation}}$$

SLIDE 13

Definition and minimization of the cost function

Outline

- Introduction: model problem
- Definition and minimization of the cost function
  - Least squares problems
  - Linear (time independent) problems
- The adjoint method

SLIDE 14

Definition and minimization of the cost function: Least squares problems

Outline

- Introduction: model problem
- Definition and minimization of the cost function
  - Least squares problems
  - Linear (time independent) problems
- The adjoint method

SLIDE 15

Definition and minimization of the cost function: Least squares problems

Generalization: arbitrary number of unknowns and observations

- To be estimated: $x = (x_1, \ldots, x_n)^T \in \mathbb{R}^n$
- Observations: $y = (y_1, \ldots, y_p)^T \in \mathbb{R}^p$
- Observation operator: $y \equiv H(x)$, with $H: \mathbb{R}^n \to \mathbb{R}^p$

SLIDE 16

Definition and minimization of the cost function: Least squares problems

Generalization: arbitrary number of unknowns and observations

A simple example of observation operator

If $x = (x_1, x_2, x_3, x_4)^T$ and $y = \begin{pmatrix} \text{an observation of } (x_1 + x_2)/2 \\ \text{an observation of } x_4 \end{pmatrix}$,

then $H(x) = Hx$ with $H = \begin{pmatrix} 1/2 & 1/2 & 0 & 0 \\ 0 & 0 & 0 & 1 \end{pmatrix}$
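A short sketch of this linear observation operator, with a hypothetical state vector:

```python
import numpy as np

# Linear observation operator from the slide: y = H x, with
# y1 observing (x1 + x2)/2 and y2 observing x4.
H = np.array([[0.5, 0.5, 0.0, 0.0],
              [0.0, 0.0, 0.0, 1.0]])

x = np.array([18.0, 20.0, 5.0, 3.0])  # hypothetical state vector (n = 4)
y = H @ x                             # model-equivalent of the observations (p = 2)
print(y)                              # [19.  3.]
```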

SLIDE 17

Definition and minimization of the cost function: Least squares problems

Generalization: arbitrary number of unknowns and observations

- To be estimated: $x = (x_1, \ldots, x_n)^T \in \mathbb{R}^n$
- Observations: $y = (y_1, \ldots, y_p)^T \in \mathbb{R}^p$
- Observation operator: $y \equiv H(x)$, with $H: \mathbb{R}^n \to \mathbb{R}^p$
- Cost function: $J(x) = \frac{1}{2}\,\|H(x) - y\|^2$, with $\|\cdot\|$ to be chosen

SLIDE 18

Definition and minimization of the cost function: Least squares problems

Reminder: norms and scalar products

Let $u = (u_1, \ldots, u_n)^T \in \mathbb{R}^n$.

- Euclidean norm: $\|u\|^2 = u^T u = \sum_{i=1}^{n} u_i^2$, with associated scalar product $(u, v) = u^T v = \sum_{i=1}^{n} u_i v_i$
- Generalized norm: let $M$ be a symmetric positive definite matrix. $M$-norm: $\|u\|_M^2 = u^T M u = \sum_{i=1}^{n} \sum_{j=1}^{n} m_{ij}\, u_i u_j$, with associated scalar product $(u, v)_M = u^T M v = \sum_{i=1}^{n} \sum_{j=1}^{n} m_{ij}\, u_i v_j$
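A small sketch of the $M$-norm, assuming a hypothetical symmetric positive definite matrix:

```python
import numpy as np

# M-norm ||u||_M^2 = u^T M u for a symmetric positive definite M.
M = np.array([[2.0, 0.5],
              [0.5, 1.0]])   # hypothetical SPD weight matrix
u = np.array([1.0, -2.0])

m_norm_sq = u @ M @ u        # 2(1)^2 + 2(0.5)(1)(-2) + 1(-2)^2 = 4.0
euclid_sq = u @ u            # ordinary Euclidean norm squared = 5.0
print(m_norm_sq, euclid_sq)
```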

SLIDE 19

Definition and minimization of the cost function: Least squares problems

Generalization: arbitrary number of unknowns and observations

- To be estimated: $x = (x_1, \ldots, x_n)^T \in \mathbb{R}^n$
- Observations: $y = (y_1, \ldots, y_p)^T \in \mathbb{R}^p$
- Observation operator: $y \equiv H(x)$, with $H: \mathbb{R}^n \to \mathbb{R}^p$
- Cost function: $J(x) = \frac{1}{2}\,\|H(x) - y\|^2$, with $\|\cdot\|$ to be chosen

SLIDE 20

Definition and minimization of the cost function: Least squares problems

Generalization: arbitrary number of unknowns and observations

- To be estimated: $x = (x_1, \ldots, x_n)^T \in \mathbb{R}^n$
- Observations: $y = (y_1, \ldots, y_p)^T \in \mathbb{R}^p$
- Observation operator: $y \equiv H(x)$, with $H: \mathbb{R}^n \to \mathbb{R}^p$
- Cost function: $J(x) = \frac{1}{2}\,\|H(x) - y\|^2$, with $\|\cdot\|$ to be chosen

(Intuitive) necessary (but not sufficient) condition for the existence of a unique minimum: $p \geq n$.

SLIDE 21

Definition and minimization of the cost function: Least squares problems

Formalism “background value + new observations”

$$Y = \begin{pmatrix} x_b \\ y \end{pmatrix} \quad \begin{matrix} \leftarrow \text{background} \\ \leftarrow \text{new observations} \end{matrix}$$

The cost function becomes:

$$J(x) = \underbrace{\frac{1}{2}\,\|x - x_b\|_b^2}_{J_b} + \underbrace{\frac{1}{2}\,\|H(x) - y\|_o^2}_{J_o}$$

SLIDE 22

Definition and minimization of the cost function: Least squares problems

Formalism “background value + new observations”

$$Y = \begin{pmatrix} x_b \\ y \end{pmatrix} \quad \begin{matrix} \leftarrow \text{background} \\ \leftarrow \text{new observations} \end{matrix}$$

The cost function becomes:

$$J(x) = \underbrace{\frac{1}{2}\,\|x - x_b\|_b^2}_{J_b} + \underbrace{\frac{1}{2}\,\|H(x) - y\|_o^2}_{J_o}
= \frac{1}{2} (x - x_b)^T B^{-1} (x - x_b) + \frac{1}{2} (H(x) - y)^T R^{-1} (H(x) - y)$$

SLIDE 23

Definition and minimization of the cost function: Least squares problems

Formalism “background value + new observations”

$$Y = \begin{pmatrix} x_b \\ y \end{pmatrix} \quad \begin{matrix} \leftarrow \text{background} \\ \leftarrow \text{new observations} \end{matrix}$$

The cost function becomes:

$$J(x) = \underbrace{\frac{1}{2}\,\|x - x_b\|_b^2}_{J_b} + \underbrace{\frac{1}{2}\,\|H(x) - y\|_o^2}_{J_o}
= \frac{1}{2} (x - x_b)^T B^{-1} (x - x_b) + \frac{1}{2} (H(x) - y)^T R^{-1} (H(x) - y)$$

The necessary condition for the existence of a unique minimum ($p \geq n$) is automatically fulfilled.
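A sketch of this background + observations cost function, with hypothetical sizes, covariances and values (and a linear $H$):

```python
import numpy as np

# Background + observations cost function (a sketch; all values hypothetical).
n, p = 4, 2
B = np.eye(n) * 0.25                      # background error covariance
R = np.eye(p) * 0.04                      # observation error covariance
H = np.array([[0.5, 0.5, 0.0, 0.0],
              [0.0, 0.0, 0.0, 1.0]])      # observation operator
xb = np.array([18.0, 20.0, 5.0, 3.0])     # background state
y = np.array([19.5, 2.8])                 # observations

def J(x):
    db = x - xb
    do = H @ x - y
    Jb = 0.5 * db @ np.linalg.solve(B, db)   # (1/2)(x-xb)^T B^{-1} (x-xb)
    Jo = 0.5 * do @ np.linalg.solve(R, do)   # (1/2)(Hx-y)^T R^{-1} (Hx-y)
    return Jb + Jo

print(J(xb))  # cost of the background itself: Jb = 0, only Jo remains
```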

SLIDE 24

Definition and minimization of the cost function: Least squares problems

If the problem is time dependent

- Observations are distributed in time: $y = y(t)$.
- The observation cost function becomes:

$$J_o(x) = \frac{1}{2} \sum_{i=0}^{N} \|H_i(x(t_i)) - y(t_i)\|^2$$

SLIDE 25

Definition and minimization of the cost function: Least squares problems

If the problem is time dependent

- Observations are distributed in time: $y = y(t)$.
- The observation cost function becomes:

$$J_o(x) = \frac{1}{2} \sum_{i=0}^{N} \|H_i(x(t_i)) - y(t_i)\|^2$$

- There is a model describing the evolution of $x$: $\dfrac{dx}{dt} = M(x)$, with $x(t = 0) = x_0$. Then $J$ is often no longer minimized w.r.t. $x$, but w.r.t. $x_0$ only, or w.r.t. some other parameters:

$$J_o(x_0) = \frac{1}{2} \sum_{i=0}^{N} \|H_i(x(t_i)) - y(t_i)\|^2 = \frac{1}{2} \sum_{i=0}^{N} \|H_i(M_{0 \to t_i}(x_0)) - y(t_i)\|^2$$

SLIDE 26

Definition and minimization of the cost function: Least squares problems

If the problem is time dependent

$$J(x_0) = \underbrace{\frac{1}{2}\,\|x_0 - x_0^b\|_b^2}_{\text{background term } J_b} + \underbrace{\frac{1}{2} \sum_{i=0}^{N} \|H_i(x(t_i)) - y(t_i)\|^2}_{\text{observation term } J_o}$$
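A sketch of such a time-dependent cost function, assuming a toy linear model in place of $M$:

```python
import numpy as np

# Sketch of a time-dependent cost J_o(x0): propagate x0 with a toy linear
# model M_{0->t_i} and compare with observations at times t_i.
A = np.array([[0.99, 0.02],
              [-0.02, 0.99]])            # hypothetical one-step model matrix
H = np.array([[1.0, 0.0]])               # observe the first component only

def model_run(x0, n_steps):
    """Return the trajectory [x(t_0), ..., x(t_N)] of the toy model."""
    traj = [x0]
    for _ in range(n_steps):
        traj.append(A @ traj[-1])
    return np.array(traj)

def Jo(x0, y_obs):
    traj = model_run(x0, len(y_obs) - 1)
    misfits = traj @ H.T - y_obs          # H_i(M_{0->t_i}(x0)) - y(t_i)
    return 0.5 * np.sum(misfits ** 2)

# Hypothetical "truth" and observations (here noise-free, so Jo(truth) = 0).
x_true = np.array([1.0, 0.5])
y_obs = model_run(x_true, 10) @ H.T
print(Jo(x_true, y_obs))                  # 0.0
print(Jo(np.array([1.2, 0.4]), y_obs))    # > 0 for a perturbed initial state
```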
SLIDE 27

Definition and minimization of the cost function: Least squares problems

Uniqueness of the minimum?

$$J(x_0) = J_b(x_0) + J_o(x_0) = \frac{1}{2}\,\|x_0 - x_b\|_b^2 + \frac{1}{2} \sum_{i=0}^{N} \|H_i(M_{0 \to t_i}(x_0)) - y(t_i)\|^2$$

- If $H$ and $M$ are linear, then $J_o$ is quadratic.
SLIDE 28

Definition and minimization of the cost function: Least squares problems

Uniqueness of the minimum?

$$J(x_0) = J_b(x_0) + J_o(x_0) = \frac{1}{2}\,\|x_0 - x_b\|_b^2 + \frac{1}{2} \sum_{i=0}^{N} \|H_i(M_{0 \to t_i}(x_0)) - y(t_i)\|^2$$

- If $H$ and $M$ are linear, then $J_o$ is quadratic.
- However it generally does not have a unique minimum, since the number of observations is generally smaller than the size of $x_0$ (the problem is underdetermined: $p < n$).

Example: let $(x_1^t, x_2^t) = (1, 1)$ (the true state) and let $y = 1.1$ be an observation of $\frac{1}{2}(x_1 + x_2)$:

$$J_o(x_1, x_2) = \frac{1}{2} \left( \frac{x_1 + x_2}{2} - 1.1 \right)^2$$

SLIDE 29

Definition and minimization of the cost function: Least squares problems

Uniqueness of the minimum?

$$J(x_0) = J_b(x_0) + J_o(x_0) = \frac{1}{2}\,\|x_0 - x_b\|_b^2 + \frac{1}{2} \sum_{i=0}^{N} \|H_i(M_{0 \to t_i}(x_0)) - y(t_i)\|^2$$

- If $H$ and $M$ are linear, then $J_o$ is quadratic.
- However it generally does not have a unique minimum, since the number of observations is generally smaller than the size of $x_0$ (the problem is underdetermined).
- Adding $J_b$ makes the problem of minimizing $J = J_o + J_b$ well posed.

Example: let $(x_1^t, x_2^t) = (1, 1)$ (the true state), let $y = 1.1$ be an observation of $\frac{1}{2}(x_1 + x_2)$, and let $(x_1^b, x_2^b) = (0.9, 1.05)$:

$$J(x_1, x_2) = \underbrace{\frac{1}{2} \left( \frac{x_1 + x_2}{2} - 1.1 \right)^2}_{J_o} + \underbrace{\frac{1}{2} \left[ (x_1 - 0.9)^2 + (x_2 - 1.05)^2 \right]}_{J_b} \;\longrightarrow\; (x_1^*, x_2^*) = (0.94166\ldots,\; 1.09166\ldots)$$
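A worked check of this example: since $J$ is quadratic, $\nabla J = 0$ reduces to a $2 \times 2$ linear system:

```python
import numpy as np

# Setting grad J = 0 for the example above gives A x = b, with
#   grad Jo = ((x1+x2)/2 - 1.1) * [1/2, 1/2]
#   grad Jb = [x1 - 0.9, x2 - 1.05]
A = np.array([[0.25 + 1.0, 0.25],
              [0.25, 0.25 + 1.0]])
b = np.array([1.1 / 2 + 0.9, 1.1 / 2 + 1.05])

x_star = np.linalg.solve(A, b)
print(x_star)  # [0.94166667 1.09166667], as on the slide
```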

SLIDE 30

Definition and minimization of the cost function: Least squares problems

Uniqueness of the minimum?

$$J(x_0) = J_b(x_0) + J_o(x_0) = \frac{1}{2}\,\|x_0 - x_b\|_b^2 + \frac{1}{2} \sum_{i=0}^{N} \|H_i(M_{0 \to t_i}(x_0)) - y(t_i)\|^2$$

- If $H$ and/or $M$ are nonlinear, then $J_o$ is no longer quadratic.
SLIDE 31

Definition and minimization of the cost function Least squares problems

Uniqueness of the minimum ?

J(x0) = Jb(x0)+Jo(x0) = 1 2 kx0 xbk2

b + 1

2

N

X

i=0

kHi(M0!ti(x0))y(ti)k2

  • I If H and/or M are nonlinear then Jo is no longer quadratic.

Example: the Lorenz system (1963) 8 > > > > > > < > > > > > > : dx dt = α(y x) dy dt = βx y xz dz dt = γz + xy

SLIDE 32

Definition and minimization of the cost function: Least squares problems

http://www.chaos-math.org

SLIDE 33

Definition and minimization of the cost function: Least squares problems

Uniqueness of the minimum?

$$J(x_0) = J_b(x_0) + J_o(x_0) = \frac{1}{2}\,\|x_0 - x_b\|_b^2 + \frac{1}{2} \sum_{i=0}^{N} \|H_i(M_{0 \to t_i}(x_0)) - y(t_i)\|^2$$

- If $H$ and/or $M$ are nonlinear, then $J_o$ is no longer quadratic.
SLIDE 34

Definition and minimization of the cost function: Least squares problems

Uniqueness of the minimum?

$$J(x_0) = J_b(x_0) + J_o(x_0) = \frac{1}{2}\,\|x_0 - x_b\|_b^2 + \frac{1}{2} \sum_{i=0}^{N} \|H_i(M_{0 \to t_i}(x_0)) - y(t_i)\|^2$$

- If $H$ and/or $M$ are nonlinear, then $J_o$ is no longer quadratic.

Example: the Lorenz system (1963)

$$\begin{cases} \dfrac{dx}{dt} = \alpha (y - x) \\[4pt] \dfrac{dy}{dt} = \beta x - y - xz \\[4pt] \dfrac{dz}{dt} = -\gamma z + xy \end{cases}
\qquad
J_o(y_0) = \frac{1}{2} \sum_{i=0}^{N} \left( x(t_i) - x^{obs}(t_i) \right)^2$$
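A sketch of this Lorenz cost function, assuming the classical parameter values for $(\alpha, \beta, \gamma)$ and a simple Euler integration:

```python
import numpy as np

# Integrate the Lorenz (1963) system with forward Euler and evaluate Jo as a
# function of the initial y-value y0, observing only x(t).
alpha, beta, gamma = 10.0, 28.0, 8.0 / 3.0   # classical values assumed
dt, n_steps = 0.01, 1000

def lorenz_run(y0, x0=1.0, z0=20.0):
    x, y, z = x0, y0, z0
    xs = [x]
    for _ in range(n_steps):
        dx = alpha * (y - x)
        dy = beta * x - y - x * z
        dz = -gamma * z + x * y
        x, y, z = x + dt * dx, y + dt * dy, z + dt * dz
        xs.append(x)
    return np.array(xs)

x_obs = lorenz_run(2.0)              # hypothetical observations from a "true" y0 = 2

def Jo(y0):
    return 0.5 * np.sum((lorenz_run(y0) - x_obs) ** 2)

# Chaos makes Jo highly non-quadratic in y0, with many local minima.
for y0 in (1.9, 2.0, 2.1):
    print(y0, Jo(y0))
```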

SLIDE 35

Definition and minimization of the cost function: Least squares problems

Uniqueness of the minimum?

$$J(x_0) = J_b(x_0) + J_o(x_0) = \frac{1}{2}\,\|x_0 - x_b\|_b^2 + \frac{1}{2} \sum_{i=0}^{N} \|H_i(M_{0 \to t_i}(x_0)) - y(t_i)\|^2$$

- If $H$ and/or $M$ are nonlinear, then $J_o$ is no longer quadratic.
SLIDE 36

Definition and minimization of the cost function: Least squares problems

Uniqueness of the minimum?

$$J(x_0) = J_b(x_0) + J_o(x_0) = \frac{1}{2}\,\|x_0 - x_b\|_b^2 + \frac{1}{2} \sum_{i=0}^{N} \|H_i(M_{0 \to t_i}(x_0)) - y(t_i)\|^2$$

- If $H$ and/or $M$ are nonlinear, then $J_o$ is no longer quadratic.
- Adding $J_b$ makes it “more quadratic” ($J_b$ is a regularization term), but $J = J_o + J_b$ may nevertheless have several (local) minima.

SLIDE 37

Definition and minimization of the cost function: Least squares problems

A fundamental remark before going into minimization aspects

Once $J$ is defined (i.e. once all its ingredients are chosen: control variables, norms, observations...), the problem is entirely defined, and hence so is its solution. The “physical” (i.e. the most important) part of data assimilation lies in the definition of $J$. The rest of the job, i.e. minimizing $J$, is “only” technical work.

SLIDE 38

Definition and minimization of the cost function: Linear (time independent) problems

Outline

- Introduction: model problem
- Definition and minimization of the cost function
  - Least squares problems
  - Linear (time independent) problems
- The adjoint method

SLIDE 39

Definition and minimization of the cost function: Linear (time independent) problems

Reminder: norms and scalar products

Let $u = (u_1, \ldots, u_n)^T \in \mathbb{R}^n$.

- Euclidean norm: $\|u\|^2 = u^T u = \sum_{i=1}^{n} u_i^2$, with associated scalar product $(u, v) = u^T v = \sum_{i=1}^{n} u_i v_i$
- Generalized norm: let $M$ be a symmetric positive definite matrix. $M$-norm: $\|u\|_M^2 = u^T M u = \sum_{i=1}^{n} \sum_{j=1}^{n} m_{ij}\, u_i u_j$, with associated scalar product $(u, v)_M = u^T M v = \sum_{i=1}^{n} \sum_{j=1}^{n} m_{ij}\, u_i v_j$

SLIDE 40

Definition and minimization of the cost function: Linear (time independent) problems

Reminder: norms and scalar products

Let $u: \Omega \subset \mathbb{R}^n \to \mathbb{R},\; x \mapsto u(x)$, with $u \in L^2(\Omega)$.

- Euclidean (or $L^2$) norm: $\|u\|^2 = \int_\Omega u^2(x)\, dx$
- Associated scalar product: $(u, v) = \int_\Omega u(x)\, v(x)\, dx$

SLIDE 41

Definition and minimization of the cost function: Linear (time independent) problems

Reminder: derivatives and gradients

Let $f: E \to \mathbb{R}$ ($E$ being of finite or infinite dimension).

- Directional (or Gâteaux) derivative of $f$ at point $x \in E$ in direction $d \in E$:

$$\frac{\partial f}{\partial d}(x) = \hat{f}[x](d) = \lim_{\alpha \to 0} \frac{f(x + \alpha d) - f(x)}{\alpha}$$

Example: the partial derivatives $\dfrac{\partial f}{\partial x_i}$ are the directional derivatives in the directions of the members of the canonical basis ($d = e_i$).
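A numerical illustration of the directional derivative, for a hypothetical quadratic $f$:

```python
import numpy as np

# Numerical Gateaux derivative: df/dd (x) ~ [f(x + a d) - f(x)] / a, small a.
def f(x):
    return 0.5 * np.sum(x ** 2)   # hypothetical smooth function, grad f(x) = x

def directional_derivative(f, x, d, alpha=1e-6):
    return (f(x + alpha * d) - f(x)) / alpha

x = np.array([1.0, 2.0])
d = np.array([0.0, 1.0])          # canonical direction e_2 -> partial df/dx2
print(directional_derivative(f, x, d))   # ~ 2.0 = (grad f(x), d)
```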

SLIDE 42

Definition and minimization of the cost function: Linear (time independent) problems

Reminder: derivatives and gradients

Let $f: E \to \mathbb{R}$ ($E$ being of finite or infinite dimension).

- Gradient (or Fréchet derivative): $E$ being a Hilbert space, $f$ is Fréchet differentiable at point $x \in E$ iff there exists $p \in E$ such that

$$f(x + h) = f(x) + (p, h) + o(\|h\|) \quad \forall h \in E$$

$p$ is the derivative or gradient of $f$ at point $x$, denoted $f'(x)$ or $\nabla f(x)$.

- $h \mapsto (p(x), h)$ is a linear function, called the differential, or tangent linear function, or Jacobian of $f$ at point $x$.

SLIDE 43

Definition and minimization of the cost function: Linear (time independent) problems

Reminder: derivatives and gradients

Let $f: E \to \mathbb{R}$ ($E$ being of finite or infinite dimension).

- Gradient (or Fréchet derivative): $E$ being a Hilbert space, $f$ is Fréchet differentiable at point $x \in E$ iff there exists $p \in E$ such that

$$f(x + h) = f(x) + (p, h) + o(\|h\|) \quad \forall h \in E$$

$p$ is the derivative or gradient of $f$ at point $x$, denoted $f'(x)$ or $\nabla f(x)$.

- $h \mapsto (p(x), h)$ is a linear function, called the differential, or tangent linear function, or Jacobian of $f$ at point $x$.
- Important (obvious) relationship: $\dfrac{\partial f}{\partial d}(x) = (\nabla f(x), d)$

SLIDE 44

Definition and minimization of the cost function: Linear (time independent) problems

Minimum of a quadratic function in finite dimension

Theorem: generalized (or Moore-Penrose) inverse

Let $M$ be a $p \times n$ matrix with rank $n$, and $b \in \mathbb{R}^p$ (hence $p \geq n$). Let $J(x) = \|Mx - b\|^2 = (Mx - b)^T (Mx - b)$.

$J$ is minimum for $\hat{x} = M^+ b$, where $M^+ = (M^T M)^{-1} M^T$ (the generalized, or Moore-Penrose, inverse).

SLIDE 45

Definition and minimization of the cost function: Linear (time independent) problems

Minimum of a quadratic function in finite dimension

Theorem: generalized (or Moore-Penrose) inverse

Let $M$ be a $p \times n$ matrix with rank $n$, and $b \in \mathbb{R}^p$ (hence $p \geq n$). Let $J(x) = \|Mx - b\|^2 = (Mx - b)^T (Mx - b)$.

$J$ is minimum for $\hat{x} = M^+ b$, where $M^+ = (M^T M)^{-1} M^T$ (the generalized, or Moore-Penrose, inverse).

Corollary: with a generalized norm

Let $N$ be a $p \times p$ symmetric positive definite matrix. Let $J_1(x) = \|Mx - b\|_N^2 = (Mx - b)^T N (Mx - b)$.

$J_1$ is minimum for $\hat{x} = (M^T N M)^{-1} M^T N\, b$.
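A numerical sketch of the theorem and its corollary, with hypothetical $M$, $b$ and $N$:

```python
import numpy as np

# Minimize J(x) = ||Mx - b||^2 via the normal equations, and check against
# numpy's least-squares solver. Sizes are hypothetical (p > n).
rng = np.random.default_rng(0)
p, n = 6, 3
M = rng.normal(size=(p, n))            # full column rank with probability 1
b = rng.normal(size=p)

x_hat = np.linalg.solve(M.T @ M, M.T @ b)      # x = (M^T M)^{-1} M^T b
x_ref = np.linalg.lstsq(M, b, rcond=None)[0]   # same minimizer
print(np.allclose(x_hat, x_ref))               # True

# Weighted version: J1(x) = (Mx-b)^T N (Mx-b) with N SPD (here diagonal).
N = np.diag([1.0, 2.0, 0.5, 1.0, 4.0, 1.0])
x1 = np.linalg.solve(M.T @ N @ M, M.T @ N @ b)  # x = (M^T N M)^{-1} M^T N b
```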

SLIDE 46

Definition and minimization of the cost function: Linear (time independent) problems

Link with data assimilation

This gives the solution to the problem $\min_{x \in \mathbb{R}^n} J_o(x) = \frac{1}{2}\,\|Hx - y\|_o^2$ in the case of a linear observation operator $H$:

$$J_o(x) = \frac{1}{2} (Hx - y)^T R^{-1} (Hx - y) \quad \longrightarrow \quad \hat{x} = (H^T R^{-1} H)^{-1} H^T R^{-1}\, y$$

SLIDE 47

Definition and minimization of the cost function: Linear (time independent) problems

Link with data assimilation

Similarly:

$$J(x) = J_b(x) + J_o(x) = \frac{1}{2}\,\|x - x_b\|_b^2 + \frac{1}{2}\,\|Hx - y\|_o^2
= \frac{1}{2} (x - x_b)^T B^{-1} (x - x_b) + \frac{1}{2} (Hx - y)^T R^{-1} (Hx - y)
= \frac{1}{2}\,\|Mx - b\|_N^2$$

with

$$M = \begin{pmatrix} I_n \\ H \end{pmatrix}, \qquad b = \begin{pmatrix} x_b \\ y \end{pmatrix}, \qquad N = \begin{pmatrix} B^{-1} & 0 \\ 0 & R^{-1} \end{pmatrix}$$

SLIDE 48

Definition and minimization of the cost function: Linear (time independent) problems

Link with data assimilation

Similarly:

$$J(x) = J_b(x) + J_o(x) = \frac{1}{2}\,\|x - x_b\|_b^2 + \frac{1}{2}\,\|Hx - y\|_o^2
= \frac{1}{2} (x - x_b)^T B^{-1} (x - x_b) + \frac{1}{2} (Hx - y)^T R^{-1} (Hx - y)
= \frac{1}{2}\,\|Mx - b\|_N^2$$

with

$$M = \begin{pmatrix} I_n \\ H \end{pmatrix}, \qquad b = \begin{pmatrix} x_b \\ y \end{pmatrix}, \qquad N = \begin{pmatrix} B^{-1} & 0 \\ 0 & R^{-1} \end{pmatrix}$$

which leads to

$$\hat{x} = x_b + \underbrace{(B^{-1} + H^T R^{-1} H)^{-1} H^T R^{-1}}_{\text{gain matrix}}\,\underbrace{(y - Hx_b)}_{\text{innovation vector}}$$

Remark: the gain matrix also reads $B H^T (H B H^T + R)^{-1}$ (Sherman-Morrison-Woodbury formula).
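A numerical check that the two gain-matrix formulas coincide, with hypothetical $H$, $B$ and $R$:

```python
import numpy as np

# Compare the two equivalent gain-matrix expressions (sizes hypothetical).
rng = np.random.default_rng(1)
n, p = 5, 3
H = rng.normal(size=(p, n))
B = np.eye(n) * 0.5                       # background error covariance
R = np.eye(p) * 0.1                       # observation error covariance

K1 = np.linalg.solve(np.linalg.inv(B) + H.T @ np.linalg.inv(R) @ H,
                     H.T @ np.linalg.inv(R))      # (B^-1 + H^T R^-1 H)^-1 H^T R^-1
K2 = B @ H.T @ np.linalg.inv(H @ B @ H.T + R)     # B H^T (H B H^T + R)^-1
print(np.allclose(K1, K2))                        # True (Sherman-Morrison-Woodbury)

xb = rng.normal(size=n)
y = rng.normal(size=p)
x_hat = xb + K1 @ (y - H @ xb)                    # analysis = background + gain * innovation
```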

SLIDE 49

Definition and minimization of the cost function: Linear (time independent) problems

Link with data assimilation

Remark:

$$\underbrace{\mathrm{Hess}(J)}_{\text{convexity}} = B^{-1} + H^T R^{-1} H = \underbrace{[\mathrm{Cov}(\hat{x})]^{-1}}_{\text{accuracy}} \quad \text{(cf. BLUE)}$$

SLIDE 50

Definition and minimization of the cost function: Linear (time independent) problems

Remark

Given the sizes of $n$ and $p$, it is generally impossible to handle $H$, $B$ and $R$ explicitly, so the direct computation of the gain matrix is impossible. Hence, even in the linear case (for which we have an explicit expression for $\hat{x}$), the computation of $\hat{x}$ is performed using an optimization algorithm.

SLIDE 51

The adjoint method

Outline

- Introduction: model problem
- Definition and minimization of the cost function
- The adjoint method
  - Rationale
  - A simple example
  - A more complex (but still linear) example
  - Control of the initial condition
  - The adjoint method as a constrained minimization

SLIDE 52

The adjoint method: Rationale

Outline

- Introduction: model problem
- Definition and minimization of the cost function
- The adjoint method
  - Rationale
  - A simple example
  - A more complex (but still linear) example
  - Control of the initial condition
  - The adjoint method as a constrained minimization

SLIDE 53

The adjoint method: Rationale

Descent methods

Descent methods for minimizing the cost function require the knowledge of (an estimate of) its gradient:

$$x_{k+1} = x_k + \alpha_k d_k \quad \text{with} \quad d_k = \begin{cases} -\nabla J(x_k) & \text{gradient method} \\ -[\mathrm{Hess}(J)(x_k)]^{-1}\,\nabla J(x_k) & \text{Newton method} \\ -B_k\,\nabla J(x_k) & \text{quasi-Newton methods (BFGS, ...)} \\ -\nabla J(x_k) + \dfrac{\|\nabla J(x_k)\|^2}{\|\nabla J(x_{k-1})\|^2}\,d_{k-1} & \text{conjugate gradient} \\ \ldots & \ldots \end{cases}$$
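A minimal gradient-descent sketch on the quadratic cost $J = J_b + J_o$ of the previous slides, with hypothetical ingredients:

```python
import numpy as np

# Gradient method d_k = -grad J(x_k) on the quadratic cost
# J(x) = 1/2 (x-xb)^T B^{-1} (x-xb) + 1/2 (Hx-y)^T R^{-1} (Hx-y).
rng = np.random.default_rng(2)
n, p = 4, 2
H = rng.normal(size=(p, n))
B, R = np.eye(n), np.eye(p) * 0.1            # hypothetical covariances
xb, y = rng.normal(size=n), rng.normal(size=p)

def grad_J(x):
    return np.linalg.solve(B, x - xb) + H.T @ np.linalg.solve(R, H @ x - y)

# For a quadratic J, any fixed step below 2/lambda_max(Hess J) converges.
hess = np.linalg.inv(B) + H.T @ np.linalg.inv(R) @ H
alpha = 1.0 / np.linalg.eigvalsh(hess).max()

x = xb.copy()                                # start from the background
for k in range(2000):
    x = x - alpha * grad_J(x)

print(np.linalg.norm(grad_J(x)))             # ~ 0: x is (numerically) the minimizer
```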

SLIDE 54

The adjoint method: Rationale

The computation of $\nabla J(x_k)$ may be difficult if the dependency of $J$ on the control variable is not direct.

Example:

- $u(x)$, the solution of an ODE
- $K$, a coefficient of this ODE
- $u^{obs}(x)$, an observation of $u(x)$
- $J(K) = \frac{1}{2}\,\|u - u^{obs}\|^2$

SLIDE 55

The adjoint method: Rationale

The computation of $\nabla J(x_k)$ may be difficult if the dependency of $J$ on the control variable is not direct.

Example:

- $u(x)$, the solution of an ODE
- $K$, a coefficient of this ODE
- $u^{obs}(x)$, an observation of $u(x)$
- $J(K) = \frac{1}{2}\,\|u - u^{obs}\|^2$

$$\hat{J}[K](k) = (\nabla J(K), k) = \langle \hat{u},\, u - u^{obs} \rangle \quad \text{with} \quad \hat{u} = \frac{\partial u}{\partial k}(K) = \lim_{\alpha \to 0} \frac{u_{K + \alpha k} - u_K}{\alpha}$$

SLIDE 56

The adjoint method: Rationale

It is often difficult (or even impossible) to obtain the gradient through the computation of growth rates.

Example:

$$\begin{cases} \dfrac{dx(t)}{dt} = M(x(t)) & t \in [0, T] \\ x(t = 0) = u \end{cases} \quad \text{with} \quad u = \begin{pmatrix} u_1 \\ \vdots \\ u_N \end{pmatrix}$$

$$J(u) = \frac{1}{2} \int_0^T \|x(t) - x^{obs}(t)\|^2\, dt \quad \longrightarrow \text{ requires one model run}$$

$$\nabla J(u) = \begin{pmatrix} \dfrac{\partial J}{\partial u_1}(u) \\ \vdots \\ \dfrac{\partial J}{\partial u_N}(u) \end{pmatrix} \simeq \begin{pmatrix} \left[J(u + \alpha\, e_1) - J(u)\right] / \alpha \\ \vdots \\ \left[J(u + \alpha\, e_N) - J(u)\right] / \alpha \end{pmatrix} \quad \longrightarrow\; N + 1 \text{ model runs}$$
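A sketch of this finite-difference gradient and its cost of $N + 1$ model runs, with a toy model standing in for $M$:

```python
import numpy as np

# Finite-difference gradient of J(u): each component needs one extra model
# run, i.e. N + 1 runs in total -- intractable when N is a model state size.
def model_run(u, n_steps=100, dt=0.01):
    """Toy 'model' (stand-in for M): simple damped evolution dx/dt = -x."""
    x = u.copy()
    traj = [x.copy()]
    for _ in range(n_steps):
        x = x - dt * x
        traj.append(x.copy())
    return np.array(traj)

def J(u, x_obs):
    traj = model_run(u)
    return 0.5 * np.sum((traj - x_obs) ** 2)

N = 10
u = np.ones(N)
x_obs = model_run(np.full(N, 1.1))    # hypothetical observed trajectory

alpha = 1e-6
J0 = J(u, x_obs)                      # 1 reference run
grad = np.array([(J(u + alpha * np.eye(N)[i], x_obs) - J0) / alpha
                 for i in range(N)])  # N more runs
print(grad.shape)                     # (10,) at the cost of N + 1 = 11 model runs
```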
SLIDE 57

The adjoint method: Rationale

In most actual applications, $N = \dim(u)$ is large (or even very large: e.g. $N = O(10^8\text{-}10^9)$ in meteorology), so this method cannot be used.

Alternatively, the adjoint method provides a very efficient way to compute $\nabla J$.

SLIDE 58

The adjoint method: Rationale

In most actual applications, $N = \dim(u)$ is large (or even very large: e.g. $N = O(10^8\text{-}10^9)$ in meteorology), so this method cannot be used.

Alternatively, the adjoint method provides a very efficient way to compute $\nabla J$.

Conversely, do not forget that if the size of the control variable is very small (fewer than 10-20 components), $\nabla J$ can easily be estimated by computing growth rates.
