INdAM intensive period "Computational Methods for Inverse Problems in Imaging", Conclusive Workshop, Como, Italy, August 16-18, 2018
(Numerical linear algebra, regularization in Banach spaces and) variable exponent Lebesgue spaces for ...
Related joint works:
- Image restoration, acceleration and preconditioning: Paola Brianzi, Fabio Di Benedetto, Pietro Dell'Acqua, Marco Donatelli (DIMA, Univ. Genova; DISAT, Univ. dell'Aquila; Univ. Insubria, Como).
- Microwave inverse scattering: Alessandro Fedeli, Matteo Pastorino, Andrea Randazzo (Dip. di Ingegneria DITEN, Univ. Genova).
- Remote sensing: Flavia Lenti, Serge Gratton, David Titley-Peloquin, Maurizio Migliaccio, Ferdinando Nunziata, Federico Benvenuto, Alberto Sorrentino (CERFACS and ENSEEIHT, Université de Toulouse; Dep. of Bioresource Engineering, McGill University, Canada; Dip. per le Tecnologie, Univ. Napoli Parthenope; Dip. di Matematica (DIMA), Univ. Genova).
- Subsurface prospecting: Gian Piero Deidda, Patricia Díaz de Alba, Caterina Fenu, Giuseppe Rodriguez (Dip. di Ing. Civile; Dip. di Matematica e Informatica, Univ. Cagliari).
Introduction: linear equations and inverse problems in Hilbert vs Banach space settings
Inverse Problem
From the knowledge of some "observed" data y (i.e., the effect), find an approximation of some model parameters x (i.e., the cause). Given the (noisy) data $y \in Y$, find (an approximation of) the unknown $x \in X$ such that
$Ax = y$
where $A : X \to Y$ is a known linear operator, and $X, Y$ are two functional (here Hilbert or Banach) spaces.
[Figure: true image; blurred (noisy) image; restored image.]
Inverse problems are usually ill-posed; they need regularization techniques.
Numerical linear algebra lives in Hilbert spaces
A Hilbert space is a complete vector space endowed with a scalar product that allows (lengths and) angles to be defined and computed. In any Hilbert space, what happens in our Euclidean world (that is, in our part of the universe ...) always holds:
- Pythagorean theorem: if $u \perp v$, then $\|u + v\|^2 = \|u\|^2 + \|v\|^2$;
- Parallelogram identity: $\|u + v\|^2 + \|u - v\|^2 = 2(\|u\|^2 + \|v\|^2)$;
- Gradient of (the square of) the norm: $\nabla \tfrac12 \|u\|^2 = u$.
The scalar product also allows for a natural definition of:
- the orthogonal projection (best approximation) of $u$ on $v$: $P(u) = \langle u, v \rangle\, v / \|v\|^2$;
- the SVD (or eigenvalue/eigenvector) decomposition of linear operators (for separable Hilbert spaces, such as any finite-dimensional Hilbert space ...).
Regularization: the "classical" framework in Hilbert spaces
In general, all the regularization methods for ill-posed functional equations have been deeply investigated in the context of Hilbert spaces.
Benefits: Any linear (or linearized) operator in Hilbert spaces can be decomposed into a set of eigenfunctions by using conventional spectral theory. This way, the convergence and regularization properties of any solving method can be analyzed by considering the behavior of each single eigenfunction (i.e., we can use the SVD ...).
Drawback: Regularization methods in Hilbert spaces usually give rise to smooth (and over-smooth) solutions. In image deblurring, regularization methods in Hilbert spaces do not allow a good localization of the edges.
Regularization: the "recent" framework in Banach spaces
More recently, some regularization methods have been introduced and investigated in Banach spaces. In a general Banach space, only distances between its elements can be defined and measured; no scalar product (thus no "angle") is available.
Benefits: Due to the geometrical properties of Banach spaces, these regularization methods allow us to obtain solutions with lower over-smoothness, which results, for instance, in a better localization and restoration of the discontinuities in imaging applications. Another useful property of regularization in Banach spaces is that solutions are sparser, that is, in general they can be represented by very few components.
Drawback: The mathematics is much more involved (spectral theory cannot be used ...). Convex analysis is required.
Regularization: Hilbert vs Banach spaces

          | Hilbert spaces                                          | Banach spaces
Spaces    | $L^2(\mathbb{R}^n)$, $H^s = W^{s,2}$                    | $L^p(\mathbb{R}^n)$, $W^{s,p}$, $p \ge 1$; $BV$
Benefits  | Easier computation (spectral theory, eigencomponents)  | Better restoration of the discontinuities; sparse solutions
Drawbacks | Over-smoothness (bad localization of edges)            | Theoretically tricky (convex analysis required)

Recap: In Banach spaces we have no scalar product (so, no orthogonal projection), no Pythagorean theorem, no SVD ... The (sub-)gradient of (the square of) the norm is not linear, so that the least squares solution (of a linear problem) is no longer a linear problem ...
Examples: $L^1$ for sparse recovery, or $L^p$, $1 < p < 2$, for edge restoration.
[Figure: in $L^2(\mathbb{R}^2)$, $L^{1.2}(\mathbb{R}^2)$, $L^5(\mathbb{R}^2)$: plots on $[-5, 5]$ of $\tfrac12\|x\|_2^2$, $\tfrac{1}{1.2}\|x\|_{1.2}^{1.2}$, $\tfrac15\|x\|_5^5$, and of their gradients $\nabla\big(\tfrac12\|x\|_2^2\big)$, $\nabla\big(\tfrac{1}{1.2}\|x\|_{1.2}^{1.2}\big)$, $\nabla\big(\tfrac15\|x\|_5^5\big)$.]
Regularization in Hilbert and Banach spaces
I. Iterative minimization algorithms: gradient-like methods (Landweber, CG)
II. Iterative projection algorithms: SIRT (Cimmino, DROP, ...)
III. Preconditioning: trigonometric matrix algebra preconditioners
IV. Multi-parameter and adaptive regularization: the variable exponent Lebesgue spaces
Regularization in Hilbert and Banach spaces
I. Iterative minimization algorithms
Iterative minimization algorithms
In the variational approach, instead of solving the operator equation $Ax = y$ directly, we minimize the residual functional $H : X \to [0, +\infty)$,
$H(x) = \tfrac{1}{r} \|Ax - y\|_Y^r .$
Basic iterative scheme:
$x_{k+1} = x_k - \lambda_k \Phi_A(x_k, y) ,$
where the operator $\Phi_A : X \times Y \to X$ returns a value
$\Phi_A(x_k, y) \approx \nabla H(x_k) = \nabla \tfrac{1}{r} \|Ax_k - y\|_Y^r ,$
that is, (an approximation of) the "gradient" of the functional $\tfrac{1}{r} \|Ax - y\|_Y^r$ at the point $x_k$, and $\lambda_k > 0$ is the step length. This way, the iterative schemes are all different generalizations of the basic gradient descent method.
The Landweber algorithm in Hilbert spaces
In the conventional case (i.e., both $X$ and $Y$ are Hilbert spaces) we consider the least squares functional
$H_2(x) = \tfrac12 \|Ax - y\|_Y^2 .$
Direct computation of $\nabla H_2$ by the chain rule for derivatives (in $\mathbb{R}^n$),
$\nabla H_2(x) = \Big( \big( \nabla \tfrac12 \|u\|^2 \big)\big|_{u = Ax - y}^{\,*} \, A \Big)^{*} = \big( (Ax - y)^* A \big)^* = A^*(Ax - y) ,$
leads to the "simplest" iterative method, the Landweber algorithm:
$x_{k+1} = x_k - \lambda A^*(Ax_k - y) .$
Since $H_2$ is convex, there exists a non-empty set of stationary points, i.e. $\nabla H_2(x) = 0$, which are all minimum points of $H_2$. The constant step size $\lambda \in (0, 2/\|A^*A\|)$ yields $H_2(x_{k+1}) < H_2(x_k)$ and also guarantees the convergence of the iterations.
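As a minimal illustration (not part of the original slides), here is a NumPy sketch of the Hilbert-space Landweber iteration above; the function name and the toy problem are our own:

```python
import numpy as np

def landweber(A, y, n_iter=200, lam=None):
    """Hilbert-space Landweber: x_{k+1} = x_k - lam * A^T (A x_k - y)."""
    if lam is None:
        # convergence needs lam in (0, 2/||A^T A||); stay safely inside
        lam = 1.0 / np.linalg.norm(A, 2) ** 2
    x = np.zeros(A.shape[1])
    for _ in range(n_iter):
        x = x - lam * (A.T @ (A @ x - y))
    return x

# toy usage on a small random problem
rng = np.random.default_rng(0)
A = rng.standard_normal((40, 30))
x_true = rng.standard_normal(30)
y = A @ x_true + 0.01 * rng.standard_normal(40)
x_rec = landweber(A, y, n_iter=500)
```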
Towards the Landweber algorithm in Banach spaces (I)
$x_{k+1} = x_k - \lambda A^*(Ax_k - y)$
Formally, $A^*$ is the dual operator of $A$, that is, the operator $A^* : Y^* \to X^*$ such that
$y^*(Ax) = (A^* y^*)(x) , \quad \forall x \in X \text{ and } \forall y^* \in Y^* ,$
where $X^*$ and $Y^*$ are the dual spaces of $X$ and $Y$. If $X$ and $Y$ are Hilbert spaces, then $X$ is isometrically isomorphic to $X^*$ and $Y$ is isometrically isomorphic to $Y^*$ (by virtue of the Riesz theorem), so that the operator $A^* : Y^* \to X^*$ can be identified with $A^* : Y \to X$. However, in general Banach spaces are not isometrically isomorphic to their duals. This way, the Landweber iteration above is well defined in Hilbert spaces (... only!).
The key point: to generalize from Hilbert to Banach spaces we have to consider the so-called duality maps.
Towards the Landweber algorithm in Banach spaces (II)
The computation of the (sub-)gradient of the residual functional requires the (sub-)gradient of the norm of the Banach space involved. Here the key tool is the so-called "duality map", which maps a Banach space $B$ into its dual $B^*$.
Theorem (Asplund, Acta Math., 1968). Let $B$ be a Banach space and let $r > 1$. The duality map $J_r^B$ is the (sub-)gradient of the convex functional $f : B \to \mathbb{R}$ defined as $f(u) = \tfrac{1}{r} \|u\|_B^r$, that is (with abuse of notation ...),
$J_r^B = \nabla f = \nabla \tfrac{1}{r} \|\cdot\|_B^r .$
Again by the chain rule, the (sub-)gradient of the residual $\tfrac{1}{r} \|Ax - y\|_Y^r$ can be computed, by means of the duality map $J_r^Y$, as follows:
$\nabla \tfrac{1}{r} \|Ax - y\|_Y^r = A^* J_r^Y (Ax - y) .$
The Landweber method in Banach spaces
Let $r > 1$ be a fixed weight value. Let $x_0 \in X$ be an initial guess (the null vector $x_0 = 0 \in X$ is often used in the applications), and set $x_0^* = J_r^X(x_0) \in X^*$.
For $k = 0, 1, 2, \ldots$
$x_{k+1}^* = x_k^* - \lambda_k A^* J_r^Y (Ax_k - y) ,$
$x_{k+1} = J_{r^*}^{X^*} (x_{k+1}^*) ,$
where $r^*$ is the Hölder conjugate of $r$, that is, $\tfrac{1}{r} + \tfrac{1}{r^*} = 1$, and the step sizes $\lambda_k$ are suitably chosen.
Here the duality map $J_r^X : X \to 2^{X^*}$ acts on the iterates $x_k \in X$, and the duality map $J_{r^*}^{X^*} : X^* \to 2^{X^{**}}$ acts on the iterates $x_k^* \in X^*$. In order for the scheme to be well defined, it is only required that the space $X$ be reflexive, that is, $X^{**}$ is isometrically isomorphic to $X$, so that $J_{r^*}^{X^*}(x^*) \subseteq X$.
Landweber iterative method in Hilbert spaces:
$A : X \to Y$, $\quad A^* : Y \to X$, $\quad H_2(x) = \tfrac12 \|Ax - y\|_Y^2$
$x_{k+1} = x_k - \lambda A^*(Ax_k - y)$
Landweber iterative method in Banach spaces:
$A : X \to Y$, $\quad A^* : Y^* \to X^*$, $\quad H_r(x) = \tfrac{1}{r} \|Ax - y\|_Y^r$
$x_{k+1} = J_{r^*}^{X^*}\big( J_r^X x_k - \lambda_k A^* J_r^Y (Ax_k - y) \big)$
Some remarks:
(i) Any duality map is in general nonlinear (and multi-valued ...), so that, differing from the Hilbert space case, the Landweber method is not linear.
(ii) In the Banach space $L^p$, $p \in (1, +\infty)$, by direct computation we have
$J_r^{L^p}(x) = \|x\|_p^{r-p}\, |x|^{p-1}\, \mathrm{sgn}(x) .$
$J_r^{L^p}$ is a nonlinear, single-valued, diagonal operator, whose cost is $O(n)$. Hence $J_r^{L^p}$ does not increase the global numerical complexity $O(n \log n)$ of image deblurring with (FFT-based) structured matrices. Note that $J_2^{L^2} = I$.
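To make the dual-space bookkeeping concrete, here is a hedged NumPy sketch of the $L^p$ duality map and of the Banach Landweber iteration above, on a finite-dimensional $\ell^p$ discretization; the identification $X = Y = \ell^p$, the helper names, and the fixed step size are our own assumptions, not from the talk:

```python
import numpy as np

def duality_map(x, p, r):
    """J_r^{Lp}(x) = ||x||_p^{r-p} * |x|^{p-1} * sgn(x), single-valued for 1 < p < inf."""
    nrm = np.linalg.norm(x, p)
    if nrm == 0.0:
        return np.zeros_like(x)
    return nrm ** (r - p) * np.abs(x) ** (p - 1) * np.sign(x)

def landweber_banach(A, y, p=1.5, r=2.0, n_iter=200, lam=1e-2):
    """Banach Landweber: update in the dual X*, map back via J_{r*}^{X*}.
    X* = l^q with 1/p + 1/q = 1; r* is the Hoelder conjugate of r.
    The fixed lam is a crude stand-in for the 'suitably chosen' lambda_k."""
    q = p / (p - 1.0)
    r_conj = r / (r - 1.0)
    x = np.zeros(A.shape[1])
    x_dual = np.zeros_like(x)                                # x_0^* = J_r^X(0) = 0
    for _ in range(n_iter):
        x_dual -= lam * (A.T @ duality_map(A @ x - y, p, r)) # step in X*
        x = duality_map(x_dual, q, r_conj)                   # back to X via J_{r*}^{X*}
    return x
```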
A convergence result for the Landweber method in Banach spaces
Proposition [Schöpfer, Louis, Schuster, Inv. Prob., 2006]. Let $X$ be a reflexive Banach space and $Y$ an (arbitrary) Banach space. Let $y \in \mathcal{R}(A)$ and let $x^\dagger$ be the minimum norm pseudo-solution of $Ax = y$. If $\lambda_k > 0$ is suitably (...) chosen for all $k$, then the sequence of the iterates $(x_k)$ converges strongly to $x^\dagger$, that is,
$\lim_{k \to +\infty} \|x_k - x^\dagger\|_X = 0 .$
If the data $y$ is noisy, an early stop of the iterations (by the discrepancy principle) gives rise to an iterative regularization method.
In the Hilbert setting, the iterations are defined in the (primal) space $X$. In the Banach setting, the iterations are defined in the dual space $X^*$ and are linked to the ("wide" ...) Banach fixed point theory.
Duality maps are the basic tool for generalizing classical iterative methods for linear systems to Banach spaces: the Landweber method, CG, Mann iterations, Gauss-Newton gradient-type iterations (for nonlinear problems). Basic hypotheses for the convergence (and regularization behavior): uniformly smooth and (uniformly) convex Banach spaces. See also: Scherzer, Kaltenbacher, Fornasier, Hofmann, Kazimierski, Pöschl, Hein, Q. Jin, Tautenhahn, Neubauer, Tomba, ...
To reduce over-smoothness, these methods have been implemented in the context of $L^p$ Lebesgue spaces with $1 < p \le 2$:
- $p \gtrsim 1$: low "regularization"; good recovery of edges and discontinuities; improved sparsity.
- $p \approx 2$: high "regularization"; higher stability; over-smoothness.
Numerical evidence in the $L^p$ Lebesgue spaces, $1 < p \le 2$: classical 2D image restoration with the Landweber method (200 iterations).
[Figure: true image $x$; PSF ($A$); blurred and noisy image $y$; Hilbert restoration ($p = 2$); Banach restorations ($p = 1.5$ and $p = 1.2$).]
[Figure: a classical image restoration example. Convergence history: relative restoration errors $\|x - x_k\|_2^2 / \|x\|_2^2$ versus the iteration index $k$.]
The Conjugate Gradient in $L^p$ Banach spaces
The Landweber algorithm in the Banach space setting gives better restorations, but it is still slow. Instead of $-\nabla H(x_k)$, we consider a different anti-gradient descent direction, based on the same idea as the classical CG.
The CG(NR) method for the minimization of $H(x) = \tfrac12 \|Ax - y\|_2^2$ in Hilbert spaces:
$x_{k+1} = x_k + \alpha_k p_k$
$p_{k+1} = -\nabla H(x_{k+1}) + \beta_k p_k = -A^*(Ax_{k+1} - y) + \beta_k p_k$
where $p_0 = -\nabla H(x_0) = -A^*(Ax_0 - y)$, and
$\alpha_k = \frac{(Ap_k)^T (y - Ax_k)}{(Ap_k)^T (Ap_k)} , \qquad \beta_k = \frac{\nabla H(x_{k+1})^T \nabla H(x_{k+1})}{\nabla H(x_k)^T \nabla H(x_k)} .$
The step size $\alpha_k$ is called optimal, since $\alpha := \alpha_k$ solves the linear equation $\frac{d}{d\alpha} H(x_k + \alpha p_k) = 0$.
Some facts about the conventional CG method.
Definition: $x_k$ is said to be optimal with respect to the direction $v$ if $H(x_k) \le H(x_k + \lambda v)$ for all $\lambda \in \mathbb{R}$.
By virtue of the optimal choice of the step length $\alpha_k$, the new iterate $x_{k+1} = x_k + \alpha_k p_k$ is optimal with respect to the direction $p_k$. The key idea of the CG method is that the new iterate $x_{k+1}$ must be optimal w.r.t. the (previous) direction $p_{k-1}$ too. This leads to the computation of the special coefficient $\beta_k$ (i.e., the so-called Fletcher-Reeves formula), and it can be proven that $x_{k+1}$ is automatically optimal w.r.t. all the previous directions $p_0, p_1, \ldots, p_{k-2}$.
The optimality of $x_{k+1}$ w.r.t. the previous descent direction $p_{k-1}$ is important to achieve (finite-step) convergence. Indeed, in the two-dimensional Hilbert (i.e., $L^2$ or Euclidean ...) case, all the points which are optimal with respect to a fixed direction lie on a line passing through the minimum point.
[Figure: level sets and sets of optimal points w.r.t. a fixed direction in $L^2(\mathbb{R}^n)$, $L^{1.2}(\mathbb{R}^n)$, and $L^5(\mathbb{R}^n)$.]
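For comparison with the Banach variant below, a minimal NumPy sketch of the conventional CG(NR) recursion above (the function name is ours):

```python
import numpy as np

def cgnr(A, y, n_iter=50):
    """CG(NR): conjugate gradient for H(x) = 0.5 * ||A x - y||_2^2,
    with the optimal step alpha_k and the Fletcher-Reeves beta_k."""
    x = np.zeros(A.shape[1])
    g = -A.T @ (A @ x - y)                    # -grad H(x_0)
    p = g.copy()
    for _ in range(n_iter):
        Ap = A @ p
        alpha = (Ap @ (y - A @ x)) / (Ap @ Ap)  # optimal step along p
        x = x + alpha * p
        g_new = -A.T @ (A @ x - y)
        beta = (g_new @ g_new) / (g @ g)        # Fletcher-Reeves coefficient
        p = g_new + beta * p
        g = g_new
    return x
```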
The Conjugate Gradient method in $L^p$ Banach spaces
The CG(NR) method for the minimization of $H(x) = \tfrac{1}{p} \|Ax - y\|_Y^p$, in $Y = L^p$:
$x_{k+1}^* = x_k^* + \alpha_k p_k^*$
$x_{k+1} = J_{p^*}^{X^*} (x_{k+1}^*)$
$p_{k+1}^* = -\nabla H(x_{k+1}) + \beta_k p_k^* = -A^* J_p^Y (Ax_{k+1} - y) + \beta_k p_k^*$
where $p_0^* = -\nabla H(x_0) = -A^* J_p^Y (Ax_0 - y)$, the step size $\alpha_k$ solves (approximately) the nonlinear equation
$\frac{d}{d\alpha} H\big( J_{p^*}^{X^*}(x_k^* + \alpha p_k^*) \big) = 0 ,$
and the coefficient $\beta_k$ satisfies a "Fletcher-Reeves"-like formula
$\beta_k = \gamma\, \frac{\|Ax_{k+1} - y\|_p^p}{\|Ax_k - y\|_p^p} , \quad \text{with } \gamma < 1/2 .$
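A hedged sketch of this Banach CG(NR) in $\ell^p$; the talk solves the 1D step-size equation approximately, which we replace here by a bounded scalar minimization (helper names, bounds, and the choice $X = Y = \ell^p$ are our own assumptions):

```python
import numpy as np
from scipy.optimize import minimize_scalar

def jmap(x, p, r):
    """Duality map J_r^{Lp}(x) = ||x||_p^{r-p} |x|^{p-1} sgn(x)."""
    n = np.linalg.norm(x, p)
    return np.zeros_like(x) if n == 0 else n ** (r - p) * np.abs(x) ** (p - 1) * np.sign(x)

def cg_banach(A, y, p=1.5, n_iter=50, gamma=0.45):
    """CG(NR)-like iteration in l^p: dual iterates and directions,
    a bounded 1D search for the step size, damped Fletcher-Reeves beta."""
    q = p / (p - 1.0)                    # dual exponent, 1/p + 1/q = 1
    x_dual = np.zeros(A.shape[1])
    x = jmap(x_dual, q, q)
    d = -A.T @ jmap(A @ x - y, p, p)     # p*_0 = -A^* J_p^Y (A x_0 - y)
    res_old = np.linalg.norm(A @ x - y, p) ** p
    for _ in range(n_iter):
        H = lambda a: np.linalg.norm(A @ jmap(x_dual + a * d, q, q) - y, p) ** p / p
        alpha = minimize_scalar(H, bounds=(0.0, 2.0), method="bounded").x
        x_dual = x_dual + alpha * d
        x = jmap(x_dual, q, q)           # back to the primal space
        res_new = np.linalg.norm(A @ x - y, p) ** p
        d = -A.T @ jmap(A @ x - y, p, p) + gamma * (res_new / res_old) * d
        res_old = res_new
    return x
```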
Convergence of CG in $L^p$ Banach spaces
Theorem [E., Gratton, Lenti, Titley-Peloquin; Num. Math., 2018]. (...) the sequence $\{x_k\}_k$ of the CG method in $L^p$ converges strongly to the minimum norm pseudo-solution $x^\dagger$ of $Ax = y$, i.e.
$\|x_k - x^\dagger\|_p \to 0 \quad (k \to +\infty) .$
If the data $y$ is noisy, an early stop of the iterations (by the discrepancy principle) gives rise to an iterative regularization method.
Remark: Differing from the conventional CG in Hilbert spaces, there is NO finite ($n$-step) convergence in $\mathbb{R}^n$. However, finite-step convergence in Banach spaces still holds in a simplified setting [Herzog R., Wollner W.; J. Inverse Ill-Posed Probl., 2017].
[Figure: relative errors vs. iterations in $L^2$, $L^{1.5}$, $L^{1.2}$; red: CG, magenta: Landweber.]
Some comments about the results: For $p = 2$ the CG method is too fast (i.e., not enough regularization), and does not provide the same good quality as the slower Landweber. For the smaller $p = 1.5, 1.2$ the CG decelerates, and it gives the same quality of restoration given by the Landweber method, in far fewer iterations.
Regularization in Hilbert and Banach spaces
II. Iterative projection algorithms
Iterated projections (row action methods)
The points $\tilde{x}$ satisfying the $i$-th row equation $\langle a_i, \tilde{x} \rangle = b_i$ of the $m \times n$ linear system $Ax = b$ define a hyperplane
$Q_i = \{\tilde{x} \in \mathbb{R}^n : \langle a_i, \tilde{x} \rangle = b_i\} .$
The solution $x$ of the linear system belongs to all the hyperplanes $Q_i$, for $i = 1, \ldots, m$. In Hilbert spaces, BECAUSE
- the orthogonal projection $P_i(z)$ of a point $z$ onto one hyperplane $Q_i$ is easy to perform,
$P_i(z) = z + \frac{b_i - \langle a_i, z \rangle}{\|a_i\|_2^2}\, a_i^T ,$
- and the projection $P_i(z)$ is closer to the solution $x$ than $z$ itself,
THEN iteratively projecting $x_{k+1} = P_i(x_k)$ onto different hyperplanes $Q_i$ gives rise to a "low-cost" sequence $(x_k)_{k=1}^{+\infty}$ converging to the solution $x$.
Some families of projection methods (in Hilbert spaces)
- ART (Algebraic Reconstruction Techniques), or Kaczmarz's methods:
$x_{k+1} = x_k + \lambda_k\, \frac{b_i - \langle a_i, x_k \rangle}{\|a_i\|^2}\, a_i^T$
where the row index $i$ depends on the iteration index $k$ (in many ways, for instance $i = k \bmod m$, where $m$ is the number of rows of $A$, or even randomly (!)), and $\lambda_k \in \mathbb{R}$ is a relaxation parameter.
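A minimal sketch of the cyclic Kaczmarz/ART sweep (our own naming; the sweep order $i = k \bmod m$ is one of the options mentioned above):

```python
import numpy as np

def kaczmarz(A, b, n_sweeps=50, lam=1.0):
    """ART / Kaczmarz: cyclically project onto the row hyperplanes
    <a_i, x> = b_i, with relaxation parameter lam."""
    m, n = A.shape
    x = np.zeros(n)
    row_norms = (A * A).sum(axis=1)    # ||a_i||_2^2 for every row
    for _ in range(n_sweeps):
        for i in range(m):             # i = k mod m sweep order
            x += lam * (b[i] - A[i] @ x) / row_norms[i] * A[i]
    return x
```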
- SIRT (Simultaneous Iterative Reconstruction Techniques):
the same "fixed" iterate $x_k$ is used (i.e., simultaneously) for a complete set of $m$ different projections, that is, for a full set of indexes $i = 1, 2, \ldots, m$. In this case, a matrix-form iteration holds:
$x_{k+1} = x_k + \lambda_k S A^* M (b - Ax_k)$
SIRT family: Cimmino method, DROP (Diagonally Relaxed Orthogonal Projection), CAV (Component Averaging), BIP (Block-Iterative Projections), ...
Cimmino and DROP methods
In the SIRT family $x_{k+1} = x_k + \lambda_k S A^* M (b - Ax_k)$, we have:
- the Cimmino method (1938), where the new iterate is the average of a complete set of $m$ projections,
$x_{k+1} = \frac{1}{m} \sum_{i=1}^m P_i(x_k) = x_k + \frac{1}{m} \sum_{i=1}^m \frac{b_i - \langle a_i, x_k \rangle}{\|a_i\|^2}\, a_i^T ,$
so that $S = I$ and $M = \frac{1}{m} D = \frac{1}{m}\, \mathrm{diag}(1/\|a_1\|^2, 1/\|a_2\|^2, \ldots, 1/\|a_m\|^2)$;
- the DROP method (Diagonally Relaxed Orthogonal Projection), where
$S = \mathrm{diag}(1/r_1, 1/r_2, \ldots, 1/r_n) , \quad M = D = \mathrm{diag}(1/\|a_1\|^2, 1/\|a_2\|^2, \ldots, 1/\|a_m\|^2) ,$
$r_j$ being the number of non-zero elements of the $j$-th column of $A$.
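Both SIRT variants fit in a few lines; a hedged NumPy sketch with the Cimmino and DROP choices of $S$ and $M$ (names are ours):

```python
import numpy as np

def sirt(A, b, n_iter=200, lam=1.0, variant="cimmino"):
    """SIRT: x_{k+1} = x_k + lam * S A^T M (b - A x_k), with the
    Cimmino (S = I, M = D/m) or DROP (S = diag(1/r_j), M = D) choices,
    where D = diag(1/||a_i||^2)."""
    m, n = A.shape
    D = 1.0 / (A * A).sum(axis=1)
    if variant == "cimmino":
        S, M = np.ones(n), D / m
    else:                                         # "drop"
        r = np.maximum((A != 0).sum(axis=0), 1)   # nonzeros per column
        S, M = 1.0 / r, D
    x = np.zeros(n)
    for _ in range(n_iter):
        x += lam * S * (A.T @ (M * (b - A @ x)))
    return x
```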
[Figure: one Cimmino step in $\mathbb{R}^2$ for the two lines $-x + 3y = 8$ ($a_1 = (-1, 3)$, $b_1 = 8$) and $-2x + \tfrac{4}{3} y = \tfrac{8}{3}$ ($a_2 = (-2, \tfrac{4}{3})$, $b_2 = \tfrac{8}{3}$): $x_1 = [P_1(x_0) + P_2(x_0)]/2$, where $P_i(x) = x + (b_i - \langle a_i, x \rangle)\, a_i / \|a_i\|^2$.]
DROP method (in Hilbert spaces):
$A : X \to Y$, $\quad A^* : Y \to X$, $\quad H_M(x) = \tfrac12 \|Ax - y\|_M^2$
$x_{k+1} = S\big( S^{-1} x_k - \lambda_k A^* M (Ax_k - y) \big)$
Landweber iterative method in Banach spaces:
$A : X \to Y$, $\quad A^* : Y^* \to X^*$, $\quad H_r(x) = \tfrac{1}{r} \|Ax - y\|_Y^r$
$x_{k+1} = J_{r^*}^{X^*}\big( J_r^X x_k - \lambda_k A^* J_r^Y (Ax_k - y) \big)$
Some remarks about $J_r^Y$, $J_{r^*}^{X^*}$ (in $L^p$ Banach spaces) vs $M$, $S$ (in $L^2$ Hilbert spaces):
- $J_r^Y$ is nonlinear while $M$ is linear; $J_{r^*}^{X^*}$ is nonlinear while $S$ is linear;
- all are diagonal and positive operators;
- all cost $O(n)$ operations;
- the action of the matrix $M$ in DROP is "similar" to the one of the duality map $J_r^Y = J_2^{L^p}$ with $1 < p < 2$ in Landweber-Banach.
In summary, the Landweber iterative method in Banach spaces can also be viewed as a nonlinear generalization of well-known projection algorithms for linear systems. The oblique geometry of the $M$-induced norm $\|\cdot\|_M$ in Hilbert space is replaced by the $L^p$-norm $\|\cdot\|_{L^p}$ in the $L^p$ Banach space setting.
Regularization in Hilbert and Banach spaces
III. Preconditioning
Preconditioning in Banach spaces
Preconditioned system in Hilbert space, with (invertible) preconditioner $D$:
$A^* A x = A^* y \iff D A^* A x = D A^* y$
Preconditioned Landweber method in Hilbert spaces:
$x_{k+1} = x_k - \lambda D A^*(Ax_k - y)$
where the preconditioner $D$ is a regularized (polynomial, rational, circulant) "low-cost" approximation (a structured and/or sparse matrix) of the inverse of $A^*A$, i.e.,
$D \approx (A^*A)^{-1} .$
Now, in Banach space, $A^*A$ is not well defined (... it cannot even be written!). We generalized preconditioning schemes to Banach spaces in two ways:
- primal preconditioning;
- dual preconditioning.
[Brianzi, Di Benedetto, E., Comp. Opt. Appl., 2013]
Primal preconditioning in Banach spaces
Remember, in Hilbert space: $x_{k+1} = x_k - \lambda D A^*(Ax_k - y)$ with $D \approx (A^*A)^{-1}$.
Primal preconditioner: $D : X \to X$. The primal preconditioner is a regularized approximation of $(J_{r^*}^{X^*} A^* J_r^Y A)^{-1}$.
From Hilbert space: $x_{k+1} = D\big( D^{-1} x_k - \lambda A^*(Ax_k - y) \big)$
To Banach space: $x_{k+1} = D\, J_{r^*}^{X^*}\big( J_r^X D^{-1} x_k - \lambda A^* J_r^Y (Ax_k - y) \big)$
Recalling that $A : X \to Y$, $A^* : Y^* \to X^*$, $J_r^X : X \to X^*$, $J_{r^*}^{X^*} : X^* \to X$, $J_r^Y : Y \to Y^*$, the iterations are well defined.
Dual preconditioning in Banach spaces
Remember, in Hilbert space: $x_{k+1} = x_k - \lambda D A^*(Ax_k - y)$ with $D \approx (A^*A)^{-1}$.
Dual preconditioner: $D : X^* \to X^*$. The dual preconditioner is a regularized approximation of $(A^* J_r^Y A J_{r^*}^{X^*})^{-1}$.
From Hilbert space: $x_{k+1} = x_k - \lambda D A^*(Ax_k - y)$
To Banach space: $x_{k+1} = J_{r^*}^{X^*}\big( J_r^X x_k - \lambda D A^* J_r^Y (Ax_k - y) \big)$
Recalling that $A : X \to Y$, $A^* : Y^* \to X^*$, $J_r^X : X \to X^*$, $J_{r^*}^{X^*} : X^* \to X$, $J_r^Y : Y \to Y^*$, the iterations are well defined.
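A hedged sketch of the dual-preconditioned Banach Landweber iteration above; for illustration we use a dense Tikhonov-style stand-in for the regularized approximation $D \approx (A^*A)^{-1}$, whereas the talk uses filtered circulant preconditioners:

```python
import numpy as np

def jmap(x, p, r):
    """Duality map of Lp with weight r (cf. the earlier slides)."""
    n = np.linalg.norm(x, p)
    return np.zeros_like(x) if n == 0 else n ** (r - p) * np.abs(x) ** (p - 1) * np.sign(x)

def landweber_dual_prec(A, y, D, p=1.5, r=2.0, n_iter=100, lam=1e-2):
    """Dual-preconditioned Banach Landweber:
    x_{k+1} = J_{r*}^{X*}( J_r^X x_k - lam * D A^T J_r^Y (A x_k - y) )."""
    q = p / (p - 1.0)
    r_conj = r / (r - 1.0)
    x = np.zeros(A.shape[1])
    for _ in range(n_iter):
        x_dual = jmap(x, p, r) - lam * (D @ (A.T @ jmap(A @ x - y, p, r)))
        x = jmap(x_dual, q, r_conj)
    return x

# toy stand-in for D ~ (A^T A)^{-1}: a dense Tikhonov-regularized inverse
# (only for illustration; not the circulant preconditioner of the slides)
def tikhonov_prec(A, alpha=1e-2):
    n = A.shape[1]
    return np.linalg.inv(A.T @ A + alpha * np.eye(n))
```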
Convergence analysis of preconditioning in $L^p$ Banach spaces
These preconditioned algorithms can be read as fixed-point iterations in the dual space $X^*$, that is,
$x_{k+1}^* = g(x_k^*) ,$
where, for the dual preconditioning, $g : X^* \to X^*$ is
$g(t) = t - \lambda D A^* J_r^Y \big( A J_{r^*}^{X^*}(t) - y \big) .$
By simple direct computation, its Jacobian $J_g$ in $L^p$ Banach spaces is
$J_g(x^*) = I - \lambda D A^* A + E ,$
where $E$ is a matrix of small norm for $p$ close to (the Hilbert value) $p = 2$. This way, by a Bauer-Fike argument, the eigenvalues of $J_g(x^*)$ are "perturbations" of those of the iteration matrix $I - \lambda D A^* A$ of the conventional Hilbert case. It follows that, for $p \approx 2$, the spectral radius of $J_g(x^*)$ is essentially the same as in the Hilbert case. In particular, choosing $D$ as an optimal approximation of $(A^*A)^\dagger$ will lead to an acceleration of the convergence, at least for $p \approx 2$.
(Regularized) preconditioning for image restoration
Preconditioner: Tikhonov-filtered T. Chan optimal circulant preconditioner.
[Figure: true image $x$; no preconditioner ($p = 1.5$), 200 iterations, rel. err. 0.4182; Tikhonov prec. ($p = 1.8$), 93 iterations, rel. err. 0.3482; Tikhonov prec. ($p = 1.5$), 33 iterations, rel. err. 0.3383.]
[Figure: relative error vs. iterations for $p = 1.5$ with Tikhonov filter $\alpha = 0.02$, $p = 1.5$ without preconditioner, and $p = 2.0$ with Tikhonov filter $\alpha = 0.02$.]
Regularization in (Hilbert and) Banach spaces
IV. Multi-parameter and adaptive regularization
A "new" framework: variable exponent Lebesgue spaces $L^{p(\cdot)}$
In image restoration, different regions of the image often require different "amounts of regularization". Setting different levels of regularization is useful because background, low intensity, and high intensity values require different filtering levels [Nagy, Pauca, Plemmons, Torgersen, J. Opt. Soc. Am. A, 1997].
The idea: the ill-posed functional equation $Af = g$ is solved in $L^{p(\cdot)}$ Banach spaces, namely the variable exponent Lebesgue spaces, a special case of the so-called Musielak-Orlicz functional spaces (first proposed in two seminal papers in 1931 and 1959, but intensively studied only in the last 10 years).
In a variable exponent Lebesgue space, to measure a function $f$, instead of a constant exponent $p$ all over the domain, we have a pointwise variable exponent (i.e., a function) $1 \le p(\cdot) \le +\infty$. This way, different values of the exponent, i.e. different kinds of regularization, on different regions of the image to restore, can be automatically and adaptively assigned.
A sketch of variable exponent Lebesgue spaces $L^{p(\cdot)}$
Classical $L^p(\Omega)$: $1 \le p \le \infty$, with $p$ constant;
$\|f\|_p = \Big( \int_\Omega |f(x)|^p\, dx \Big)^{1/p} , \qquad \|f\|_\infty = \mathrm{ess\,sup}\, |f(x)| ;$
$f \in L^p(\Omega) \iff \int_\Omega |f(x)|^p\, dx < \infty .$
Variable $L^{p(x)}(\Omega)$: $p(x) : \Omega \to [1, \infty]$ a measurable function;
$\|f\|_{p(\cdot)} = \Big( \int_\Omega |f(x)|^{p(x)}\, dx \Big)^{1/???} \ \ldots \qquad f \in L^{p(\cdot)}(\Omega) \iff \ ???$
In the following, $\Omega_\infty = \{x \in \Omega : p(x) = \infty\}$ has zero measure.
Example: $\Omega = [-5, 5]$ and
$p(x) = \begin{cases} 2 & \text{if } -5 \le x \le 0 \\ 3 & \text{if } 0 < x \le 5 \end{cases}$
Then $f(x) = \dfrac{1}{|x + 1|^{1/3}} \in L^{p(\cdot)}([-5, 5])$, while $f(x) = \dfrac{1}{|x - 1|^{1/3}} \notin L^{p(\cdot)}([-5, 5])$: near the singularity $x = -1$ the exponent is $2$ and $|x + 1|^{-2/3}$ is integrable, while near $x = 1$ the exponent is $3$ and $|x - 1|^{-1}$ is not.
The norm of variable exponent Lebesgue spaces
In the conventional case $L^p$, the norm is $\|f\|_{L^p} = \big( \int_\Omega |f(x)|^p\, dx \big)^{1/p}$. In $L^{p(\cdot)}$ Lebesgue spaces, the definition and computation of the norm is not straightforward, since we do not have a constant value for computing the ("mandatory") radical:
$\|f\|_{L^{p(\cdot)}} = \Big( \int_\Omega |f(x)|^{p(x)}\, dx \Big)^{1/???} .$
The solution: first compute the modular (for $1 \le p(\cdot) < +\infty$)
$\varrho(f) = \int_\Omega |f(x)|^{p(x)}\, dx ,$
and then obtain the (so-called Luxemburg [1955]) norm by solving a 1D minimization problem:
$\|f\|_{L^{p(\cdot)}} = \inf\Big\{ \lambda > 0 : \varrho\Big(\frac{f}{\lambda}\Big) \le 1 \Big\} .$
The elements of a variable exponent Lebesgue space
$\varrho(f) = \int_\Omega |f(x)|^{p(x)}\, dx , \qquad \|f\|_{L^{p(\cdot)}} = \inf\Big\{ \lambda > 0 : \varrho\Big(\frac{f}{\lambda}\Big) \le 1 \Big\} .$
The Lebesgue space $L^{p(\cdot)}(\Omega) = \big\{ f : \Omega \to \mathbb{R} \ | \ \|f\|_{L^{p(\cdot)}} < \infty \big\}$ is a Banach space.
In the case of a constant exponent $p(x) = p$, this norm is exactly the classical one $\|f\|_p$; indeed
$\varrho\Big(\frac{f}{\lambda}\Big) = \int_\Omega \Big( \frac{|f(x)|}{\lambda} \Big)^p dx = \frac{1}{\lambda^p} \int_\Omega |f(x)|^p\, dx = \frac{1}{\lambda^p}\, \|f\|_p^p$
and
$\inf\Big\{ \lambda > 0 : \frac{1}{\lambda^p}\, \|f\|_p^p \le 1 \Big\} = \|f\|_p .$
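Numerically, the Luxemburg norm can be computed by bisection on $\lambda$, since $\lambda \mapsto \varrho(f/\lambda)$ is decreasing. A hedged sketch on a discretized domain (quadrature rule and names are our own assumptions), with a sanity check against the constant-exponent case:

```python
import numpy as np

def modular(f, p, w):
    """rho(f) = integral of |f(x)|^{p(x)} dx (quadrature weights w)."""
    return np.sum(w * np.abs(f) ** p)

def luxemburg_norm(f, p, w, n_bisect=200):
    """||f|| = inf{ lambda > 0 : rho(f / lambda) <= 1 } by bisection."""
    lo, hi = 1e-12, 1.0
    while modular(f / hi, p, w) > 1.0:   # grow hi until feasible
        hi *= 2.0
    for _ in range(n_bisect):
        mid = 0.5 * (lo + hi)
        if modular(f / mid, p, w) > 1.0:
            lo = mid
        else:
            hi = mid
    return hi

# sanity check: constant p(x) = 3 must reproduce the classical 3-norm
x = np.linspace(-5.0, 5.0, 20001)
w = np.full_like(x, x[1] - x[0])         # rectangle-rule weights
f = np.exp(-x ** 2)
p = np.full_like(x, 3.0)
print(luxemburg_norm(f, p, w), np.sum(w * np.abs(f) ** 3) ** (1.0 / 3.0))
```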
Modular vs norm
In (classical) $L^p$, norm and modular are "the same" apart from a $p$-th root:
$\|f\|_p < \infty \iff \int_\Omega |f(x)|^p\, dx < \infty .$
In $L^{p(\cdot)}$, norm and modular are really different: $\|f\|_{p(\cdot)} < \infty$ is NOT equivalent to $\varrho(f) < \infty$. Indeed, the following holds:
$\|f\|_{p(\cdot)} < \infty \iff \text{there exists } \lambda > 0 \text{ such that } \varrho\Big(\frac{f}{\lambda}\Big) < \infty$
(and notice that $\lambda$ can be chosen large enough ...).
A simple example of a strange behavior
$\Omega = [1, \infty)$, $\quad f(x) \equiv 1$, $\quad p(x) = x$.
Then
$\varrho(f) = \int_1^\infty 1^x\, dx = \infty , \quad \text{BUT} \quad \|f\|_{p(\cdot)} \simeq 1.763 .$
Indeed,
$\varrho\Big(\frac{f}{\lambda}\Big) = \int_1^\infty \frac{1}{\lambda^x}\, dx = \frac{1}{\lambda \log \lambda} < \infty , \quad \text{for any } \lambda > 1 .$
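A quick numeric check of this value (our own, not in the slides): setting $\varrho(f/\lambda) = 1/(\lambda \log \lambda) = 1$ and solving for $\lambda$:

```python
import numpy as np
from scipy.optimize import brentq

# rho(f/lambda) = 1/(lambda * log(lambda)) = 1  <=>  lambda * log(lambda) = 1
lam = brentq(lambda t: t * np.log(t) - 1.0, 1.0 + 1e-12, 10.0)
print(lam)   # ~1.7632, matching ||f||_{p(.)} on the slide
```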
The vector case: the Lebesgue spaces of sequences $\ell^{p(\cdot)} = \ell^{(p_n)_n}$
[Figure: the unit circle of $x = (x_1, x_2)$ in $\mathbb{R}^2$ for $\|x\|_p$ and for $\|x\|_{p(\cdot)}$ with variable exponents $p = (p_1, p_2)$.]
Inclusion holds if $(p_1, p_2) \ge (q_1, q_2)$ (as in the classical case); in general there is no inclusion.
Properties of variable exponent Lebesgue spaces $L^{p(\cdot)}$
Let $p_- = \mathrm{ess\,inf}_\Omega\, p(x)$ and $p_+ = \mathrm{ess\,sup}_\Omega\, p(x)$.
If $p_+ = \infty$, then $L^{p(\cdot)}(\Omega)$ is a "bad" (although very interesting) Banach space, with poor geometric properties (i.e., it is not useful for our methods).
If $1 < p_- \le p_+ < \infty$, then $L^{p(\cdot)}(\Omega)$ is a "good" Banach space, since many properties of the classical Lebesgue spaces $L^p$ still hold: $L^{p(\cdot)}$ is uniformly smooth, uniformly convex, and reflexive, and we know the duality map of $L^{p(\cdot)}$. Its dual space is well defined:
$\big( L^{p(\cdot)} \big)^* \simeq L^{q(\cdot)} , \quad \text{where } \frac{1}{p(x)} + \frac{1}{q(x)} = 1 ,$
$L_g(f) = \int_\Omega f(x)\, g(x)\, dx \in \big( L^{p(\cdot)} \big)^* \iff g \in L^{q(\cdot)}(\Omega) .$
Unfortunately, $\big( L^{p(\cdot)} \big)^*$ and $L^{q(\cdot)}$ are isomorphic but not isometric:
$\|L_g\|_{(L^{p(\cdot)})^*} \ne \|g\|_{L^{q(\cdot)}} \ \text{in general.}$
We know the duality map of $L^{p(\cdot)}$, but we do not know the duality map of $(L^{p(\cdot)})^*$.
The duality map of the variable exponent Lebesgue space
By extending the duality maps, we can define in $L^{p(\cdot)}$ all the iterative methods developed in $L^p$ (Landweber, steepest descent, CG, Mann iterations). For any constant $1 < r < +\infty$, we recall that the duality map, that is, the (sub-)differential of the functional $\tfrac{1}{r} \|f\|_{L^p}^r$, in the classical Banach space $L^p$ with constant $1 < p < +\infty$, is defined as follows:
$\big( J^{L^p}(f) \big)(x) = \frac{|f(x)|^{p-1}\, \mathrm{sgn}(f(x))}{\|f\|_p^{p-r}} .$
By generalizing a result of P. Matei [2012], the corresponding duality map in the variable exponent Lebesgue space is
$\big( J^{L^{p(\cdot)}}(f) \big)(x) = \left[ \int_\Omega \frac{p(x)\, |f(x)|^{p(x)}}{\|f\|_{p(\cdot)}^{p(x)}}\, dx \right]^{-1} \frac{p(x)\, |f(x)|^{p(x)-1}\, \mathrm{sgn}(f(x))}{\|f\|_{p(\cdot)}^{p(x)-r}} ,$
where every product and every ratio has to be understood pointwise.
The adaptive algorithm in variable exponent Lebesgue spaces
It is numerical evidence that, in $L^p$ image deblurring:
- dealing with small $1 \approx p \ll 2$ improves sparsity and allows a better restoration of the edges of the images and of the zero background;
- dealing with $p \approx 2$ (even $p > 2$) allows a better restoration of the regions of pixels with the highest intensities.
The idea: use a version of the (re-)blurred data $y$, rescaled into $[1, 2]$, as the distribution of the exponent $p(\cdot)$ for the variable exponent Lebesgue space $L^{p(\cdot)}$ where the solution is computed. Example (linear interpolation):
$p(\cdot) = 1 + \frac{A^*y(\cdot) - \min(A^*y)}{\max(A^*y) - \min(A^*y)}$
The Landweber (i.e., fixed point) iterative scheme in this $L^{p(\cdot)}$ Banach space can be turned into an adaptive iteration algorithm by recomputing, after each fixed number of iterations, the exponent distribution $p_k(\cdot)$ by means of the $k$-th restored image $x_k$ (instead of the first re-blurred data $A^*y$), that is,
$p_k(\cdot) = 1 + \frac{x_k(\cdot) - \min(x_k)}{\max(x_k) - \min(x_k)}$
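The exponent update itself is a one-liner; a hedged sketch (names ours) of the linear rescaling used above:

```python
import numpy as np

def exponent_map(v, p_min=1.0, p_max=2.0):
    """Linearly rescale an image (re-blurred data or current iterate)
    into [p_min, p_max] to get the exponent distribution p(.)."""
    v = np.asarray(v, dtype=float)
    return p_min + (p_max - p_min) * (v - v.min()) / (v.max() - v.min())

# p0 = exponent_map(A.T @ y); then pk = exponent_map(xk) every m iterations
```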
The conjugate gradient method in $L^{p(\cdot)}$ for image restoration
Let $p(\cdot) = 1 + [A^*y(\cdot) - \min(A^*y)] / [\max(A^*y) - \min(A^*y)]$ and
$\phi_0^* = -A^* J_r^{L^{p(\cdot)}} (Ax_0 - y) .$
For $k = 0, 1, 2, \ldots$
$\lambda_k = \arg\min_\alpha \tfrac{1}{r} \|A(x_k + \alpha \phi_k) - y\|_{L^{p(\cdot)}}^r$
$x_{k+1}^* = x_k^* + \lambda_k \phi_k^*$
$x_{k+1} = J_{s'}^{(L^{p(\cdot)})^*} (x_{k+1}^*)$
$\beta_{k+1} = \gamma\, \frac{\|Ax_{k+1} - y\|_{L^{p(\cdot)}}^r}{\|Ax_k - y\|_{L^{p(\cdot)}}^r} , \quad \text{with } \gamma < 1/2$
$\phi_{k+1}^* = -A^* J_r^{L^{p(\cdot)}} (Ax_{k+1} - y) + \beta_{k+1} \phi_k^*$
and recompute $p(\cdot) = 1 + [x_k(\cdot) - \min(x_k)] / [\max(x_k) - \min(x_k)]$ every $m$ iterations, by using the last iterate $x_k$.
[Proof of convergence of CG for $L^p$ (constant exponent) Lebesgue spaces in 2017.]
Numerical results
[Figure: true image; point spread function; blurred image (noise = 4.7%).]
[Figure: restorations with relative errors: $p = 2$ (0.2692); $p = 1.3$ (0.2681); variable $p = 1.3$-$1.6$ (0.2473); variable $p = 1.3$-$1.6$ and irreg. (0.2307).]
Special thanks to Brigida Bonino (Univ. Genova).
A remote sensing application: spatial resolution enhancement of microwave radiometer data
This is an under-determined problem (with a special structured matrix), since we want to reconstruct the high-frequency components reduced by the low-pass filter of the radiometer.
Courtesy: monde-geospatial.com
Joint work with: Matteo Alparone, Flavia Lenti, Maurizio Migliaccio, Nando Nunziata (Dep. of Engineering, Univ. Napoli Parthenope).
[Figures: resolution enhancement tests on RECT and DOUBLE RECT signals, comparing reconstructions with constant $p = 2$, constant $p = 1.2$, and variable $p$ (parameters: pmin = 1.2, pmax = 2, lambda = 0.3, maxiter = 2000, eThr = 0.12, fKind = 1); resolution-enhanced TB37 data with constant $p = 1.2$ (xconst) and with variable $p$ (xvar).]
An inverse scattering application: microwave tomography
Microwave imaging is a noninvasive and nondestructive technique to inspect materials by using incident electromagnetic waves generated in the microwave range (300 MHz - 300 GHz). It is a nonlinear, implicit and ill-posed problem.
Joint work with: Alessandro Fedeli, Matteo Pastorino, Andrea Randazzo (Dep. of Engineering, Univ. Genoa).
[Figures: tomographic reconstructions with $M^2$, $M^{1.2}$, and $M^{q(1.4-2)}$, for $\vartheta_s = 2$, SNR = 15 dB, and for $\vartheta_s = 3$, SNR = 30 dB.]
Thank you for your attention
References
[1] F. Schöpfer, A.K. Louis, T. Schuster, Nonlinear iterative methods for linear ill-posed problems in Banach spaces, Inverse Problems, vol. 22, pp. 311-329, 2006.
[2] O. Scherzer, M. Grasmair, H. Grossauer, M. Haltmeier, F. Lenzen, Variational Methods in Imaging, Springer, Berlin, 2008.
[3] T. Schuster, B. Kaltenbacher, B. Hofmann, K.S. Kazimierski, Regularization Methods in Banach Spaces, Radon Series on Computational and Applied Mathematics, vol. 10, De Gruyter, 2012.
[4] L. Diening, P. Harjulehto, P. Hästö, M. Růžička, Lebesgue and Sobolev Spaces with Variable Exponents, Lecture Notes in Mathematics, vol. 2017, Springer, 2011.