SLIDE 1
Institut de Physique du Globe de Paris & Université Pierre et Marie Curie (Paris VI)
Course on Inverse Problems
Albert Tarantola
Lesson XVII: Least-Squares Involving Functions
⇒ mathematica notebook
SLIDE 2
SLIDE 3
II: Functional Formulation

The a priori information is represented by

  m_prior = { m_prior(z) } ;  C_prior = { C_prior(z, z′) } .

The observable parameters are of the kind

  o_i = ∫ dz O_i(z) m(z) ,

where, in our example, O_1(z) is a delta function and O_2(z) is a box-car function. This corresponds to a linear relation between the model function m and the observations o :

  o = O m .

We have some observations

  o_obs = { o_i^obs } ;  C_obs = { C_ij^obs } .
SLIDE 4
The solution is provided by the standard least-squares equation, that we just need to interpret. The first equation is

  m_post = m_prior − C_prior O^t ( O C_prior O^t + C_obs )^{-1} ( O m_prior − o_obs ) ,

i.e.,

  m_post = m_prior − P Q r

  P = C_prior O^t
  S = O C_prior O^t + C_obs
  Q = S^{-1}
  r = O m_prior − o_obs
SLIDE 5
This leads to

  m_post(z) = m_prior(z) − ∑_i ∑_j P_i(z) Q_ij r_j

  P_i(z) = ∫ dz′ C_prior(z, z′) O_i(z′)
  S_ij = ∫ dz ∫ dz′ O_i(z) C_prior(z, z′) O_j(z′) + C_ij^obs
  Q = S^{-1}
  r_i = ∫ dz O_i(z) m_prior(z) − o_i^obs
SLIDE 6
The second equation is

  C_post = C_prior − C_prior O^t ( O C_prior O^t + C_obs )^{-1} O C_prior ,

i.e.,

  C_post = C_prior − P Q P^t

  P = C_prior O^t
  S = O C_prior O^t + C_obs
  Q = S^{-1}
SLIDE 7
This leads to

  C_post(z, z′) = C_prior(z, z′) − ∑_i ∑_j P_i(z) Q_ij P_j(z′)

  P_i(z) = ∫ dz′ C_prior(z, z′) O_i(z′)
  S_ij = ∫ dz ∫ dz′ O_i(z) C_prior(z, z′) O_j(z′) + C_ij^obs
  Q = S^{-1}
⇒ mathematica notebook
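The formulas of the preceding slides are easy to try numerically. Below is a minimal sketch (not the Mathematica notebook): the grid on [0, 1], the exponential prior covariance, the zero prior mean, the observation positions, and the observed values are all assumptions for the illustration, not taken from the slides.

```python
import numpy as np

# Sketch of the least-squares formulas for functions. Assumed (not from
# the slides): grid on [0,1], exponential prior covariance, zero prior
# mean, a delta observable at z = 0.3, a box-car average over [0.5, 0.7].
N = 200
z = np.linspace(0.0, 1.0, N)
dz = z[1] - z[0]

sigma, ell = 1.0, 0.2
C_prior = sigma**2 * np.exp(-np.abs(z[:, None] - z[None, :]) / ell)
m_prior = np.zeros(N)

O = np.zeros((2, N))                    # observation kernels O_i(z)
i0 = np.argmin(np.abs(z - 0.3))
O[0, i0] = 1.0 / dz                     # delta function: o_1 = m(0.3)
box = (z >= 0.5) & (z <= 0.7)
O[1, box] = 1.0 / (dz * box.sum())      # box-car average over [0.5, 0.7]

o_obs = np.array([1.0, 0.5])
C_obs = np.diag([0.01**2, 0.01**2])

# P_i(z) = int dz' C_prior(z, z') O_i(z')
P = C_prior @ O.T * dz
# S_ij = int int O_i C_prior O_j + C_ij^obs   (a small 2x2 matrix)
S = O @ C_prior @ O.T * dz**2 + C_obs
# r_i = int dz O_i(z) m_prior(z) - o_i^obs
r = O @ m_prior * dz - o_obs

m_post = m_prior - P @ np.linalg.solve(S, r)
C_post = C_prior - P @ np.linalg.solve(S, P.T)
```

The posterior mean reproduces both observations to within the observational errors, and the posterior variance drops at and near the observed points while remaining at its prior value far from the data.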
SLIDE 8
Let me add a clarification. We have examined two of the many possible algorithms providing the mean posterior model n_post . The first one was the steepest-descent algorithm

  n_{k+1} = n_k − μ ( C_prior L^t C_obs^{-1} (L n_k − t_obs) + (n_k − n_prior) ) ,

which could advantageously be replaced by a preconditioned steepest-descent algorithm,

  n_{k+1} = n_k − P ( C_prior L^t C_obs^{-1} (L n_k − t_obs) + (n_k − n_prior) ) ,

where P is a suitably chosen, but arbitrary, positive definite operator. A good choice of P will accelerate convergence (but not change the final result). Let us guess what P could be.
SLIDE 9
The first iteration, when choosing n_0 = n_prior , gives

  n_1 = n_prior − P C_prior L^t C_obs^{-1} (L n_prior − t_obs) .

We also saw the Newton algorithm

  n_post = n_prior − C_prior L^t ( L C_prior L^t + C_obs )^{-1} (L n_prior − t_obs) ,

which, because the forward relation is here linear, converges in just one step. Equivalent to the last equation is (see lesson XI)

  n_post = n_prior − ( L^t C_obs^{-1} L + C_prior^{-1} )^{-1} L^t C_obs^{-1} (L n_prior − t_obs) ,

which can also be written as

  n_post = n_prior − Π C_prior L^t C_obs^{-1} (L n_prior − t_obs) ,

with

  Π = ( C_prior ( L^t C_obs^{-1} L + C_prior^{-1} ) )^{-1} .
SLIDE 10
The conclusion is that the closer the (arbitrary) preconditioning operator P is to

  Π = ( C_prior ( L^t C_obs^{-1} L + C_prior^{-1} ) )^{-1} ,

the faster the preconditioned steepest-descent algorithm will converge (and in only one step if P = Π , but this is usually impossible). Simplest example: let Q = C_prior ( L^t C_obs^{-1} L + C_prior^{-1} ) , so we must use P ≈ Q^{-1} . Letting Q(x, x′) be the kernel of Q , one may choose to approximate Q^{-1} by the inverse of its diagonal elements,

  P(x, x′) = ( 1 / Q(x, x) ) δ(x − x′) .

Evaluating Q(x, x) grossly corresponds to “counting how many rays pass through point x ”.
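As a sanity check on the role of Π , the sketch below builds a toy linear problem (the exponential prior covariance, the five box-car “ray” rows in L , and the noise level are assumptions, not from the slides), verifies that a single preconditioned step with the exact P = Π lands on the Newton solution, and constructs the cheap diagonal approximation P(x, x) = 1/Q(x, x) .

```python
import numpy as np

# Toy linear problem (assumed, not from the slides): exponential prior
# covariance on a grid, five box-car "ray" rows in L, t_obs = L n_true.
N = 60
z = np.linspace(0.0, 1.0, N)
C_prior = np.exp(-np.abs(z[:, None] - z[None, :]) / 0.2)
L = np.zeros((5, N))
for i, (a, b) in enumerate([(5, 15), (12, 25), (20, 40), (30, 50), (45, 58)]):
    L[i, a:b] = 1.0
C_obs = 0.5**2 * np.eye(5)
n_prior = np.zeros(N)
t_obs = L @ np.sin(2 * np.pi * z)

Ci_obs = np.linalg.inv(C_obs)
Ci_prior = np.linalg.inv(C_prior)

def precond_grad(n):
    # C_prior L^t C_obs^-1 (L n - t_obs) + (n - n_prior)
    return C_prior @ L.T @ Ci_obs @ (L @ n - t_obs) + (n - n_prior)

# Newton solution (one step, the forward relation being linear):
n_newton = n_prior - C_prior @ L.T @ np.linalg.solve(
    L @ C_prior @ L.T + C_obs, L @ n_prior - t_obs)

# One preconditioned step with the exact P = Pi, where Pi = Q^-1:
Q = C_prior @ (L.T @ Ci_obs @ L + Ci_prior)
n_one_step = n_prior - np.linalg.solve(Q, precond_grad(n_prior))

# Cheap diagonal approximation ("count the rays through x"):
P_diag = 1.0 / np.diag(Q)
```

With the exact Π the algorithm indeed converges in one step; the diagonal of Q is then a cheap stand-in whose quality depends on how diagonally dominant Q is.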
SLIDE 11
I have already explained that, in the context of least squares, and given a covariance function C(z, z′) , the norm of any function m = {m(z)} is to be defined via

  ‖m‖² = ( m , m ) = ⟨ C^{-1} m , m ⟩ .

Following a discussion with Prof. Mark Simons, I realized that I have not mentioned that when considering the exponential covariance function

  C(z, z′) = σ² exp( −|z − z′| / L ) ,

one obtains the simple and interesting result (Tarantola, 2005, page 311)

  ‖m‖² = ( 1 / 2σ² ) [ (1/L) ∫ dz m(z)² + L ∫ dz ( dm/dz (z) )² ] .
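This identity is easy to check numerically. In the sketch below, the grid, the test function, and the explicit 1/(2σ²) normalization are my assumptions for the check (the normalization follows from C^{-1} = (1/2σ²L)(1 − L² d²/dz²) ; other conventions differ by a constant factor); the test function vanishes at both endpoints so that boundary terms drop out.

```python
import numpy as np

# Check <C^-1 m, m> against the Sobolev-type expression for the
# exponential covariance (grid and test function are assumptions;
# m vanishes at the interval ends, so boundary terms vanish).
N = 800
z = np.linspace(0.0, 1.0, N)
dz = z[1] - z[0]
sigma, ell = 1.0, 0.1

C = sigma**2 * np.exp(-np.abs(z[:, None] - z[None, :]) / ell)
m = np.sin(np.pi * z)

quad = m @ np.linalg.solve(C, m)        # discrete <C^-1 m, m>
dm = np.gradient(m, dz)
sobolev = ((1.0 / ell) * np.sum(m**2) + ell * np.sum(dm**2)) * dz \
          / (2 * sigma**2)
```

The two numbers agree to within the discretization error, of order dz/L here.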
SLIDE 12
This implies that when solving the basic least-squares problem, i.e., the problem of finding the model m_post that minimizes the misfit function

  2 S(m) = ‖ o(m) − o_obs ‖²_{C_obs} + ‖ m − m_prior ‖²_{C_prior}
         = ⟨ C_obs^{-1} ( o(m) − o_obs ) , o(m) − o_obs ⟩
           + ⟨ C_prior^{-1} ( m − m_prior ) , m − m_prior ⟩ ,

we are imposing that m_post(z) − m_prior(z) must be small, but we are also imposing that dm_post/dz (z) − dm_prior/dz (z) must be small. In particular, if m_prior(z) is smooth, then m_post(z) must also be smooth. See page 316 of my book for the 3D case.
SLIDE 13
In lesson XII, I introduced the definition of the transpose operator: Let L be a linear operator from a linear space A into a linear space B . The transpose of L , denoted L^t , is a linear operator from B∗ into A∗ , defined by the condition that for any β ∈ B∗ and for any a ∈ A ,

  ⟨ β , L a ⟩_B = ⟨ L^t β , a ⟩_A .
SLIDE 14
Then, without demonstration, I gave two examples. If the expression b = L a means

  b_i = ∑_I L_iI a_I ,

then an expression like α = L^t β means

  α_I = ∑_i L_iI β_i .

If the expression b = L a means

  b(t) = ∫ dV(x) L(t, x) a(x) ,

then an expression like α = L^t β means

  α(x) = ∫ dt L(t, x) β(t) .

Ozgun, like St. Thomas, asks for the demonstrations.
SLIDE 15
Case s = L v ⇔ s_i = ∑_α L_iα v_α . There must be two duality products:

  ⟨ ω , v ⟩ = ∑_α ω_α v_α ;  ⟨ σ , s ⟩ = ∑_i σ_i s_i .

The transpose operator, say M = L^t , will operate on some σ to give some ω : ω = M σ ⇔ ω_α = ∑_i M_αi σ_i . Relation between M_αi and L_iα ? For any v and any σ one must have

  ⟨ σ , L v ⟩ = ⟨ M σ , v ⟩ ,

i.e.,

  ∑_i σ_i (L v)_i = ∑_α (M σ)_α v_α ,

i.e.,

  ∑_i σ_i ( ∑_α L_iα v_α ) = ∑_α ( ∑_i M_αi σ_i ) v_α ,

i.e.,

  ∑_i ∑_α L_iα σ_i v_α = ∑_i ∑_α M_αi σ_i v_α ,

and this implies that M_αi = L_iα (the matrix representing M = L^t is the transpose of the matrix representing L ).
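The discrete case can be checked in one line of linear algebra (the random matrix and vectors below are arbitrary choices for the check):

```python
import numpy as np

# Discrete case: <sigma, L v> = <L^t sigma, v>, with L^t the matrix transpose.
rng = np.random.default_rng(0)
L = rng.normal(size=(3, 5))        # maps v in R^5 to s in R^3
v = rng.normal(size=5)
sigma = rng.normal(size=3)

lhs = sigma @ (L @ v)              # <sigma, L v>
rhs = (L.T @ sigma) @ v            # <L^t sigma, v>
```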
SLIDE 16
Note: this demonstration is trivial if using matrix notations. Case s = L v . There must be two duality products:

  ⟨ ω , v ⟩ = ω^t v ;  ⟨ σ , s ⟩ = σ^t s .

The transpose operator, say M , will operate on some σ to give some ω : ω = M σ . Relation between M and L ? For any v and any σ one must have

  ⟨ σ , L v ⟩ = ⟨ M σ , v ⟩ ,

i.e., σ^t (L v) = (M σ)^t v = (σ^t M^t) v , i.e., σ^t L v = σ^t M^t v , and this implies that M^t = L , i.e., that M = L^t .
SLIDE 17
Case s = L v ⇔ s_i = ∫ dz L_i(z) v(z) . There must be two duality products:

  ⟨ ω , v ⟩ = ∫ dz ω(z) v(z) ;  ⟨ σ , s ⟩ = ∑_i σ_i s_i .

The transpose operator, say M = L^t , will operate on some σ to give some ω : ω = M σ ⇔ ω(z) = ∑_i M_i(z) σ_i . Relation between M_i(z) and L_i(z) ? For any v and any σ one must have

  ⟨ σ , L v ⟩ = ⟨ M σ , v ⟩ ,

i.e.,

  ∑_i σ_i (L v)_i = ∫ dz (M σ)(z) v(z) ,

i.e.,

  ∑_i σ_i ( ∫ dz L_i(z) v(z) ) = ∫ dz ( ∑_i M_i(z) σ_i ) v(z) ,

i.e.,

  ∑_i ∫ dz L_i(z) σ_i v(z) = ∑_i ∫ dz M_i(z) σ_i v(z) ,

⇒ M_i(z) = L_i(z) (the two operators L and M = L^t have

SLIDE 18

the same kernels); when L operates, we make a sum (integral) over z ; when L^t operates, we make a (discrete) sum over i .
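The same bookkeeping can be verified on a grid (the kernels L_i(z) and the functions below are arbitrary smooth choices for the check):

```python
import numpy as np

# Mixed case: s_i = int dz L_i(z) v(z); transpose omega(z) = sum_i L_i(z) sigma_i.
rng = np.random.default_rng(1)
N = 400
z = np.linspace(0.0, 1.0, N)
dz = z[1] - z[0]

# Three kernels L_i(z), stacked as rows (arbitrary smooth choices)
Lk = np.stack([np.sin(np.pi * z), np.cos(2 * np.pi * z), z**2])

v = np.exp(-z)                 # an element of the function space
sigma = rng.normal(size=3)     # an element of the discrete dual space

s = Lk @ v * dz                # s_i = int dz L_i(z) v(z)
omega = Lk.T @ sigma           # omega(z) = sum_i L_i(z) sigma_i

lhs = sigma @ s                # discrete duality product <sigma, s>
rhs = omega @ v * dz           # functional duality product <omega, v>
```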
SLIDE 19
Case s = L v ⇔ s(t) = ∑_α L_α(t) v_α . Please do it.

Case s = L v ⇔ s(t) = ∫ dz L(t, z) v(z) . Please do it.

Case s = L v ⇔ s_i(t, ϕ) = ∑_α ∑_β ∫ dz L_iαβ(t, ϕ, z) v_αβ(z) .

Answer:

  ω = L^t σ ⇔ ω_αβ(z) = ∑_i ∫ dt ∫ dϕ L_iαβ(t, ϕ, z) σ_i(t, ϕ) .
SLIDE 20
The transpose of the derivative operator. The derivative operator D maps a space X of functions x = {x(t)} into a space V of functions v = {v(t)} . It is defined as

  v = D x ⇔ v(t) = dx/dt (t) .

It is obviously a linear operator. By definition, the transpose operator D^t maps the dual of V (with functions that we may denote ω = {ω(t)} ) into the dual of X (with functions that we may denote χ = {χ(t)} ). We don’t need to be interested in the interpretation of the two spaces X∗ and V∗ . I want to prove that, except for some boundary condition (to be found), the derivative operator is antisymmetric, i.e., D^t = −D .
SLIDE 21
In other words, I want to prove the second of these two expressions:

  v = D x ⇔ v(t) = dx/dt (t)
  χ = D^t ω ⇔ χ(t) = − dω/dt (t) .
SLIDE 22
The elements of the two spaces X and V are functions of the same variable t . Therefore the duality products in the two spaces are here quite similar:

  ⟨ χ , x ⟩_X = ∫_{t1}^{t2} dt χ(t) x(t) ;  ⟨ ω , v ⟩_V = ∫_{t1}^{t2} dt ω(t) v(t) .
SLIDE 23
By definition of the transpose of a linear operator, for any x and any ω , one must have

  ⟨ ω , D x ⟩ = ⟨ D^t ω , x ⟩ .

Here, this gives

  ∫_{t1}^{t2} dt ω(t) (D x)(t) = ∫_{t1}^{t2} dt (D^t ω)(t) x(t) .

To prove that if (D x)(t) = dx/dt (t) , then (D^t ω)(t) = − dω/dt (t) , we must then prove that for any two functions x(t) and ω(t) ,

  ∫_{t1}^{t2} dt ω(t) dx/dt (t) + ∫_{t1}^{t2} dt dω/dt (t) x(t) = 0 .
SLIDE 24
So, is it true that for any two functions x(t) and ω(t) one has

  Ψ = ∫_{t1}^{t2} dt ω(t) dx/dt (t) + ∫_{t1}^{t2} dt dω/dt (t) x(t) = 0 ?

One has

  Ψ = ∫_{t1}^{t2} dt ω(t) dx/dt (t) + ∫_{t1}^{t2} dt dω/dt (t) x(t)
    = ∫_{t1}^{t2} dt d/dt ( ω(t) x(t) )
    = ω(t2) x(t2) − ω(t1) x(t1) .

So, one only has Ψ = 0 (and, therefore, D^t = −D ) if all the functions in the two spaces X and V∗ satisfy the dual boundary condition

  ω(t2) x(t2) = ω(t1) x(t1) .
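The boundary identity Ψ = ω(t2) x(t2) − ω(t1) x(t1) is easy to verify by quadrature (the two functions and the interval below are arbitrary choices for the check):

```python
import numpy as np

# Check Psi = omega(t2) x(t2) - omega(t1) x(t1) for two arbitrary
# functions, using their analytic derivatives and the trapezoid rule.
t1, t2, N = 0.0, 1.0, 2001
t = np.linspace(t1, t2, N)

x, dx = t**2, 2 * t
w, dw = np.cos(t), -np.sin(t)

def integral(f):
    # trapezoid rule on the uniform grid t
    return np.sum((f[1:] + f[:-1]) / 2) * (t[1] - t[0])

psi = integral(w * dx) + integral(dw * x)
boundary = w[-1] * x[-1] - w[0] * x[0]
```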
SLIDE 25
Dual boundary condition (recall): ω(t2) x(t2) = ω(t1) x(t1) . Example: If all the functions x(t) satisfy the zero initial condition x(t1) = 0 , then, in order to have D^t = −D , all the functions in V∗ must satisfy the zero final condition ω(t2) = 0 . We shall always work with these dual boundary conditions.

Exercise: Demonstrate that if all the functions on which the second derivative operator D² operates satisfy the initial conditions of rest, x(t1) = 0 and dx/dt (t1) = 0 , and all the functions α(t) on which its transpose (D²)^t operates satisfy the final conditions of rest, α(t2) = 0 and dα/dt (t2) = 0 , then the operator D² is symmetric: (D²)^t = D² .
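A quick numerical check of the exercise (not a proof): the test functions below are arbitrary choices satisfying the rest conditions, x(t1) = dx/dt(t1) = 0 and α(t2) = dα/dt(t2) = 0, and the two duality products come out equal.

```python
import numpy as np

# Numerical check of (D^2)^t = D^2 under conditions of rest on [0, 1]:
# x(0) = x'(0) = 0 and alpha(1) = alpha'(1) = 0 (assumed test functions).
t = np.linspace(0.0, 1.0, 2001)

x, d2x = t**3, 6 * t                        # x and x''
alpha, d2alpha = (1 - t)**3, 6 * (1 - t)    # alpha and alpha''

def integral(f):
    # trapezoid rule on the uniform grid t
    return np.sum((f[1:] + f[:-1]) / 2) * (t[1] - t[0])

lhs = integral(alpha * d2x)      # < alpha, D^2 x >
rhs = integral(d2alpha * x)      # < D^2 alpha, x >
```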
SLIDE 26
Tangent linear mapping. Fréchet derivative. Let ϕ be a (possibly nonlinear) mapping from the linear space A into the linear space B . The elements (vectors) of the spaces A and B may be functions. The linear mapping Φ0 is the tangent linear mapping to the mapping ϕ at a0 if ϕ(a0 + δa) = ϕ(a0) + Φ0 δa + . . . (where the “dots” mean terms that are second order in δa or higher). The Fréchet derivative (at a0 ) of the mapping ϕ is the kernel of Φ0 .
SLIDE 27
Example: Let

  b = ϕ(a) ⇔ b(t) = ∫ dz Q(t, z) ( a(z) )² .

One has

  ϕ(a0 + δa)(t) = ∫ dz Q(t, z) ( a0(z) + δa(z) )²
    = ∫ dz Q(t, z) ( a0(z)² + 2 a0(z) δa(z) + . . . )
    = ∫ dz Q(t, z) a0(z)² + ∫ dz 2 a0(z) Q(t, z) δa(z) + . . .
    = ϕ(a0)(t) + (Φ0 δa)(t) + . . .

Therefore, the tangent linear mapping to the mapping ϕ at a0(z) is the linear mapping that to any function δa(z) associates the function

  δb(t) = ∫ dz Φ0(t, z) δa(z) , with kernel Φ0(t, z) = 2 a0(z) Q(t, z) .
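The tangent linear mapping can be checked against a finite difference: for small ε , ( ϕ(a0 + ε δa) − ϕ(a0) ) / ε must approach Φ0 δa . The grid, the kernel Q , the reference point a0 , and the perturbation δa below are all assumptions for the check.

```python
import numpy as np

# Finite-difference check of the tangent linear mapping of
# b(t) = int dz Q(t, z) a(z)^2 at a0, with kernel Phi0(t,z) = 2 a0(z) Q(t,z).
Nt, Nz = 50, 80
t = np.linspace(0.0, 1.0, Nt)
z = np.linspace(0.0, 1.0, Nz)
dz = z[1] - z[0]

Q = np.exp(-(t[:, None] - z[None, :])**2 / 0.1)   # arbitrary smooth kernel
a0 = np.sin(2 * np.pi * z)
da = np.cos(np.pi * z)

def phi(a):
    # b(t) = int dz Q(t, z) a(z)^2
    return Q @ a**2 * dz

# Tangent linear mapping applied to da:
db_lin = (2 * Q * a0[None, :]) @ da * dz

eps = 1e-6
db_fd = (phi(a0 + eps * da) - phi(a0)) / eps      # finite-difference slope
```

For this quadratic ϕ the discrepancy is exactly ε ∫ dz Q(t, z) δa(z)² , i.e., first order in ε .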
SLIDE 28