
Regularization Theory

Nicolas Rougon

Institut Mines-Télécom / Télécom SudParis ARTEMIS Department; CNRS UMR 8145 nicolas.rougon@telecom-sudparis.eu

May 10, 2020


Course overview

◮ Basic problems in low-level image analysis are hard to solve
◮ The difficulties encountered have a common nature and origin, linked to the key notion of ill-posed inverse problem
◮ Generic mathematical approaches for fixing these difficulties have been developed in both the deterministic and stochastic frameworks; they are known as regularization techniques
◮ In the next courses, deterministic regularization will be applied to 3 basic image analysis problems:
  Segmentation → Active contours, Mumford-Shah
  Denoising → Anisotropic diffusion
  Motion estimation → Optical flow


Outline

1. Direct & Inverse problems
2. Restoring well-posedness
3. Regularization


An example - Surface reconstruction

[Figure: a 3D sensor acquires a 3D point set from a 3D object; the surface must be recovered]

Problem statement: Given a 3D point set, estimate the geometry of the surface (i.e. the shape) of the underlying 3D object
◮ This amounts to solving an interpolation problem over $\mathbb{R}^3$


An example - Surface reconstruction

◮ A hard problem
  Topology is lost → connectivity? homotopy?
  Geometry is lost → metrics? orientation?
  ⇒ under-constrained problem ⇒ multiple solutions
◮ Solution: enforcing prior constraints
  Topological constraints: connectivity | homotopy | boundary (open/closed)
  Geometric constraints: continuity (smoothness) | support (global/patch) | subspace


An example - Surface reconstruction

◮ A hard problem (cont'd)
  Discretization and noise generate ambiguities

[Figure: reconstruction vs. sampling ambiguities]

  Assume subspace constraints (e.g. a circle shape space):
  over-constrained problem ⇒ solutions may not exist
  noise-sensitive problem ⇒ solutions may vary radically with noise
◮ Solution: defining an approximate problem


Direct vs. inverse problems

◮ Imaging system model $H : X \to Y$
  $H$: continuous (linear | nonlinear) operator
  $X, Y$: Hilbert spaces, with norm $\|\cdot\|_X$ and dot product $\langle\cdot,\cdot\rangle_X$

◮ Direct problem: Given an input $x \in X$, generate $y \in Y$ such that $y = Hx$ → image synthesis

◮ Inverse problem: Given an output $y \in Y$, estimate $x \in X$ such that $y = Hx$ → image analysis


Well-posed vs. ill-posed problems

◮ A (direct/inverse) problem is well-posed iff the following so-called Hadamard conditions hold:
  for any data $y \in Y$, there exists a solution $x \in X$
  the solution $x \in X$ is unique
  the solution $x \in X$ depends continuously on the data $y \in Y$

[Figure: in a well-posed problem, a small data perturbation $y + \delta$ yields a correspondingly small perturbation of the solution $x$]

◮ A problem is ill-posed iff at least one Hadamard condition fails


Direct problems

◮ Since $H$ is continuous, classical direct problems in mathematics, physics and engineering are well-posed
  Physical simulation problems: often defined by basic partial differential equations (PDE) with proper boundary conditions (BC)
    propagation → elliptic PDE + Dirichlet BC (e.g. wave equation)
    diffusion → parabolic PDE + Neumann BC (e.g. heat equation)
    transport → hyperbolic PDE + Cauchy BC (e.g. eikonal, Burgers equations)
  Image synthesis problems: most models are physically-inspired


Inverse problems in image analysis

◮ Inverse problems in image science are usually ill-posed
  Segmentation | Surface reconstruction | Shape from X

  Edge detection: $H$ is an integral operator
  $(Hx)(\eta) = \int_{-\infty}^{\eta} x(\xi)\, d\xi$, where $\eta$ is the pixel location

  Surface reconstruction: $H$ is a projector onto a local basis
  $(Hx)(\eta_i) = \langle x, \varphi_i \rangle_X$, where $\eta_i$ are surface local coordinates
  → elliptic PDE
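To make the ill-posedness concrete, here is a minimal numerical sketch of the edge-detection case (the test signal, sizes and noise level are illustrative assumptions, not from the course): $H$ integrates, so inverting it differentiates, and differentiation amplifies data noise, violating the third Hadamard condition in practice.

```python
import numpy as np

rng = np.random.default_rng(0)
x = np.sin(np.linspace(0.0, 4.0, 200))           # unknown signal
y = np.cumsum(x)                                 # (Hx)(eta): integral up to eta
y_noisy = y + 0.1 * rng.standard_normal(y.size)  # slightly perturbed data

x_rec = np.diff(y, prepend=0.0)        # inverting H amounts to differentiating
x_bad = np.diff(y_noisy, prepend=0.0)  # differentiation amplifies the noise
print(np.abs(x_rec - x).max())         # ~0: exact recovery from clean data
print(np.abs(x_bad - x).max())         # large: noise dominates the estimate
```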


Inverse problems in image analysis

◮ Inverse problems in image science are usually ill-posed (cont'd)
  Matching: $H$ is a warping operator
  $(Hx)(\eta) = (x \circ \phi)(\eta)$, where $\eta$ is the pixel location

  Motion estimation: $\phi = \mathrm{Id} + v$, where $v$ is the optical flow
  Stereovision: $\phi = \mathrm{Id} + d$, where $d$ is the disparity
  → optimal isomorphism


Inverse problems in image analysis

◮ Inverse problems in image science are usually ill-posed (cont'd)
  Deconvolution: $H$ is a convolution operator
  $(Hx)(\eta) = (K \star x)(\eta)$, where $\eta$ is the pixel location

  Deblurring | Linear scale-space: $K$ is a Gaussian kernel
  → inverse diffusion


Inverse problems in image analysis

◮ Inverse problems in image science are usually ill-posed (cont'd)
  Transmission tomography reconstruction: $H$ is a Radon transform
  $(Hx)(\eta) = \int_{D_\eta} x(l)\, dl$, where $\eta$ is the projection angle and $D_\eta$ the projection line

  Computerized Tomography (CT)
  → inverse Radon transform


Inverse problems in image analysis

◮ Inverse problems in image science are usually ill-posed (cont'd)
  Emission tomography reconstruction
    Positron Emission Tomography (PET) | Single Photon Emission Computerized Tomography (SPECT)
    → inverse Fredholm integral
  Super-resolution
    Satellite | Aerial | Astronomical imaging
    → interpolation from multiple data
  Phase unwrapping
    Synthetic Aperture Radar (SAR) | Digital Holographic Microscopy (DHM)
    → analytic continuation
  ...


Outline

1. Direct & Inverse problems
2. Restoring well-posedness
3. Regularization


Problem statement

Inverse problem (P): Given the data $y \in Y$, estimate $x \in X$ such that $y = Hx$

The invertibility of $H$ depends on
  its kernel $\mathrm{Ker}(H) \subseteq X$ (a closed set)
  its span $\mathrm{Sp}(H) \subseteq Y$
◮ The problem (P) is well-posed iff $H$ is injective ($\mathrm{Ker}(H) = \{0\}$) and onto ($\mathrm{Sp}(H) = Y$). In this case $H^{-1}$ exists and is continuous, and the solution of (P) is $x = H^{-1}y$


Restoring well-posedness

Assumptions
  $H$ is not injective: $\mathrm{Ker}(H) \neq \{0\}$
  $H$ is not onto: $\mathrm{Sp}(H) \neq Y$
◮ How to restore the existence and uniqueness properties?
Key idea: Redefine the problem (P) so that $\mathrm{Ker}(H) \cap X = \{0\}$ and $\mathrm{Sp}(H) = Y$


Restoring well-posedness

Track #1: Redefine the data space $Y$ and the solution space $X$

Assumptions
  $H$ is not injective: $\mathrm{Ker}(H) \neq \{0\}$
  $\mathrm{Sp}(H)$ is closed: $Y = \mathrm{Sp}(H) \oplus \mathrm{Sp}^\perp(H)$
◮ The problem (P) is well-posed over $X = \mathrm{Ker}^\perp(H)$, $Y = \mathrm{Sp}(H)$


Restoring well-posedness

Track #1: Redefine the data space $Y$ and the solution space $X$
◮ Bridging the gap from theory to practice is rarely possible, since there are no generic algorithms to
  compute $\mathrm{Ker}^\perp(H)$
  test whether $y \in \mathrm{Sp}(H)$

Track #2: Redefine the problem operator $H$ so as to end up with a computable inverse operator


Problem statement

Idea: Find $x \in X$ such that $Hx$ is as close as possible to the data $y \in Y$

Assumption: $\mathrm{Sp}(H)$ is closed
◮ This amounts to
  replacing the exact problem (P) by an approximate one
  addressing the latter in an optimization framework, involving a loss function and an optimizer


Pseudo-solutions

Pseudo-inverse problem ($\hat{\mathrm{P}}$): Given data $y \in Y$, find a minimizer $\hat x \in X$ of the loss function $\ell(x) = \|Hx - y\|_Y$:
$\hat x = \arg\min_{x \in X} \|Hx - y\|_Y$

$\hat x$ is referred to as a pseudo-solution of (P)
$\ell(x)$ is also termed a cost / objective function
When $\|\cdot\|_Y$ is the $L^2$-norm, $\ell(x)$ is the mean square error and $\hat x$ a least squares solution


Pseudo-inverse

◮ Let $H^*$ be the adjoint operator of $H$, defined by $\langle Hx, y\rangle_Y = \langle x, H^*y\rangle_X$

Normal equations: Pseudo-solutions $\hat x$ verify the normal equations
$H^*Hx = H^*y$
a special case of the Euler equations $\partial_x \ell(x) = 0$

Pseudo-inverse: The pseudo-inverse of $H$ is the operator $\hat H : Y \to X$ defined by
$\hat H = (H^*H)^{-1}H^*$
$\hat H$ is continuous
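A minimal finite-dimensional sketch of the normal equations (the sizes and the random test operator are illustrative assumptions): when $H$ is a full-column-rank matrix, the pseudo-solution is obtained by solving $H^*Hx = H^*y$, which is exactly what a least squares routine computes.

```python
import numpy as np

rng = np.random.default_rng(0)
H = rng.standard_normal((20, 10))     # injective (full column rank) operator
x_true = rng.standard_normal(10)
y = H @ x_true + 0.01 * rng.standard_normal(20)   # noisy data

# Pseudo-solution via the normal equations H*H x = H*y
x_hat = np.linalg.solve(H.T @ H, H.T @ y)

# np.linalg.lstsq minimizes ||Hx - y||_2 directly (numerically preferable)
x_lstsq, *_ = np.linalg.lstsq(H, y, rcond=None)
print(np.allclose(x_hat, x_lstsq))    # True: same least squares solution
```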


Pseudo-inverse

◮ Proof: Assume $H$ is a linear operator⋆. Let $x + \varepsilon\,\delta x$ be a variation of $x$ with small magnitude $\varepsilon > 0$ in some direction $\delta x \in X$

⋆ if not, linearize $H$ locally: $H(x + \varepsilon\,\delta x) \approx Hx + \varepsilon\, \partial_x H\, \delta x$

$\ell(x + \varepsilon\,\delta x) = \langle H(x + \varepsilon\,\delta x) - y,\; H(x + \varepsilon\,\delta x) - y\rangle_Y$
$= \langle (Hx - y) + \varepsilon H\delta x,\; (Hx - y) + \varepsilon H\delta x\rangle_Y$
$= \ell(x) + 2\varepsilon\, \langle Hx - y,\; H\delta x\rangle_Y + O(\varepsilon^2)$
$= \ell(x) + 2\varepsilon\, \langle H^*(Hx - y),\; \delta x\rangle_X + O(\varepsilon^2)$

Taking the limit
$\lim_{\varepsilon \to 0} \frac{\ell(x + \varepsilon\,\delta x) - \ell(x)}{\varepsilon} = \langle H^*Hx - H^*y,\; \delta x\rangle_X$
yields a directional derivative $D_{\delta x}\ell(x) = \langle \partial_x \ell(x),\, \delta x\rangle_X$, known as the Gâteaux derivative of $\ell$ in the direction $\delta x$
If $x$ is a minimizer of $\ell$, then $D_{\delta x}\ell(x) = 0$ for all $\delta x \in X$ □


Well-posedness

◮ Since $\hat H$ is always defined, pseudo-solutions always exist
  Let $P_H$ be a projector onto $\mathrm{Sp}(H)$; the pseudo-solutions of ($\hat{\mathrm{P}}$) for data $y$ and $P_H y$ are the same
◮ If $H$ is injective
  the problem ($\hat{\mathrm{P}}$) is well-posed; its solution is $\hat x = \hat H y$
◮ If $H$ is not injective
  the problem ($\hat{\mathrm{P}}$) is under-constrained (multiple solutions)


Problem statement

Idea: Among all pseudo-solutions, retain the one with minimal norm

Assumptions
  $\mathrm{Sp}(H)$ is closed
  $H$ is not injective


Generalized solutions

Generalized inverse problem ($\mathrm{P}^\dagger$): Given data $y \in Y$, find the minimizer $x^\dagger \in X$ with minimal norm of the loss function $\ell(x) = \|Hx - y\|_Y$:
$x^\dagger = \arg\min \big\{\, \|x\|_X \;:\; x \in \arg\min_{x \in X} \|Hx - y\|_Y \,\big\}$

$x^\dagger$ is referred to as a generalized solution of (P)


Generalized inverse

◮ $x^\dagger$ is unique; an inverse operator can therefore be defined

Generalized inverse: The generalized inverse of $H$ is the operator $H^\dagger : Y \to X$ such that $H^\dagger y = x^\dagger$
  $H^\dagger$ is continuous
  $x^\dagger \in \mathrm{Ker}^\perp(H)$
◮ The problem ($\mathrm{P}^\dagger$) is well-posed iff $\mathrm{Sp}(H)$ is closed
◮ Generalized inverses may be trivial or unrealistic w.r.t. (P)
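In finite dimension, the generalized inverse is the Moore-Penrose pseudoinverse. A short sketch with an assumed rank-deficient test matrix, checking the two properties stated above:

```python
import numpy as np

rng = np.random.default_rng(1)
# Rank-2 matrix with 4 columns: Ker(H) is non-trivial, so the pseudo-solutions
# of min ||Hx - y|| form an affine subspace.
H = rng.standard_normal((6, 2)) @ rng.standard_normal((2, 4))
y = rng.standard_normal(6)

x_dag = np.linalg.pinv(H) @ y          # minimal-norm pseudo-solution

# x_dag lies in Ker^perp(H): it is orthogonal to any kernel direction.
_, _, Vt = np.linalg.svd(H)
kernel_dir = Vt[-1]                    # right singular vector with sigma ~ 0
print(np.allclose(H @ kernel_dir, 0.0))   # True: a direction in Ker(H)
print(abs(x_dag @ kernel_dir) < 1e-10)    # True: x_dag orthogonal to Ker(H)
```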


Problem statement

Idea: Introduce prior knowledge by enforcing constraints on solutions

Assumptions
  $\mathrm{Sp}(H)$ is closed
  $H$ is not injective
  a constraint space $Z$ (Hilbert) and a linear constraint operator $C : X \to Z$, inducing a semi-norm $\|Cx\|_Z$ over $X$


C-generalized solutions

C-generalized inverse problem ($\mathrm{P}^\dagger_C$): Given data $y \in Y$, find the minimizer $x^\dagger_C$ with minimal constraint norm of the loss function $\ell(x) = \|Hx - y\|_Y$:
$x^\dagger_C = \arg\min \big\{\, \|Cx\|_Z \;:\; x \in \arg\min_{x \in X} \|Hx - y\|_Y \,\big\}$

$x^\dagger_C$ is referred to as a C-generalized solution of (P)


C-generalized inverse

◮ $x^\dagger_C$ does not necessarily exist

Theorem: The C-generalized solution $x^\dagger_C$ exists when $C$ verifies
  1. $\mathrm{Ker}(C) \cap \mathrm{Ker}(H) = \{0\}$
  2. the domain of $C$ is dense in $X$
  3. $C$ is closed
  4. the pseudo-solutions in $\mathrm{Ker}(C)$ are mapped by $H$ into a closed set in $Y$
  5. $C$ is bounded


C-generalized inverse

◮ Under the previous conditions, an inverse operator can be defined

C-generalized inverse: The C-generalized inverse of $H$ is the operator $H^\dagger_C : Y \to X$ such that $H^\dagger_C\, y = x^\dagger_C$
  $H^\dagger_C$ is bounded
  If $C$ has a bounded inverse, $H^\dagger_C = C^{-1}\,(HC^{-1})^\dagger$

Well-posedness

◮ The problem ($\mathrm{P}^\dagger_C$) can be ill-posed
◮ Differential operators are not bounded. Consequently, the C-generalized inverse framework does not allow for integrating
  smoothness constraints
  local geometric constraints
which play a pivotal role in image analysis


Outline

1. Direct & Inverse problems
2. Restoring well-posedness
3. Regularization


Problem statement

Idea: Find $x \in X$ ensuring the best trade-off between prior constraints and closeness of $Hx$ to the data $y \in Y$

Assumptions
  $\mathrm{Sp}(H)$ is not closed
  $H$ is not injective
This is the case for kernel operators $Hx = K \star x$


Regularized solutions

Regularized inverse problem ($\mathrm{P}_\lambda$): Given $y \in Y$, find a minimizer $x_\lambda$ of the loss function $E_\lambda(x)$:
$x_\lambda = \arg\min_{x \in X} \big[\, \|Hx - y\|_Y^2 + \lambda\, \|Cx\|_Z^2 \,\big]$

$x_\lambda$ is referred to as a regularized solution of (P)

The regularization functional (energy) $E_\lambda(x)$ comprises
  a data consistency term $\|Hx - y\|_Y^2$
  a model term $\|Cx\|_Z^2$ defining prior constraints on solutions

The regularization parameter $\lambda > 0$ controls the trade-off between data and prior knowledge: when data noise increases (i.e. data reliability decreases), the model influence can be strengthened by increasing $\lambda$
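A small sketch of $E_\lambda(x)$ in matrix form (the moving-average blur $H$ and the first-difference operator $C$ are illustrative assumptions); larger `lam` favors smoother solutions over data fidelity:

```python
import numpy as np

n = 50
H = sum(np.eye(n, k=k) for k in range(-2, 3)) / 5.0   # toy blur operator
C = np.diff(np.eye(n), axis=0)                        # 1st-order differences

def E(x, y, lam):
    """Regularization energy: data consistency + lam * smoothness prior."""
    return np.sum((H @ x - y) ** 2) + lam * np.sum((C @ x) ** 2)
```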


Tikhonov stabilizers

◮ Admissible constraints

Tikhonov stabilizers:
$\|Cx\|_Z^2 = \sum_{i=1}^{n} \int c_i(\xi)\, \|D^i x(\xi)\|^2\, d\xi$

The potential function $c_i(\xi)\, \|D^i x(\xi)\|^2$ defines a quadratic smoothness constraint of order $i$, enforcing $C^i$-continuity
The weighting functions $c_i(\xi) > 0$ locally control the smoothness of solutions; they are usually chosen constant

◮ Oversmoothing artefacts
Quadratic smoothness constraints do not allow for preserving salient data discontinuities


Regularized inverse

Euler equations: Regularized solutions $x_\lambda$ verify the following Euler equations:
$(H^*H + \lambda C^*C)\, x = H^*y$

Regularized inverse: The regularized inverse of $H$ is the operator $R_\lambda : Y \to X$ defined by
$R_\lambda = (H^*H + \lambda C^*C)^{-1} H^*$
$R_\lambda$ is continuous
$\lim_{\lambda \to 0} R_\lambda\, y = H^\dagger y$
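In the matrix setting, the Euler equations yield the regularized solution in a single linear solve; a hedged sketch (any $H$, $C$, $y$ of compatible sizes work, e.g. the toy setup above):

```python
import numpy as np

def regularized_solution(H, C, y, lam):
    """Solve the Euler equations (H*H + lam C*C) x = H*y."""
    A = H.T @ H + lam * (C.T @ C)
    return np.linalg.solve(A, H.T @ y)

# The limit lim_{lam -> 0} R_lam y = H_dagger y can be checked numerically
# for well-conditioned H, e.g. against np.linalg.pinv(H) @ y.
```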


Well-posedness

Theorem (Tikhonov-Arsenin): The regularized problem ($\mathrm{P}_\lambda$) is well-posed; its solution is $x_\lambda = R_\lambda y$

◮ In many cases, $H^*$ is not defined in closed form, so $x_\lambda$ cannot be computed analytically

[Figure: a 1D energy profile $E(x)$ with descent iterates from $x_0$ to $x_\lambda$]

Minimizing $E_\lambda(x)$ is then performed iteratively using gradient descent techniques
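A sketch of such an iterative scheme, using $\partial_x E_\lambda(x) = 2H^*(Hx - y) + 2\lambda C^*Cx$; the fixed step size and iteration count are assumptions (the step must be small enough for convergence):

```python
import numpy as np

def gradient_descent(H, C, y, lam, tau=1e-3, n_iter=5000):
    """Iteratively minimize ||Hx - y||^2 + lam * ||Cx||^2."""
    x = np.zeros(H.shape[1])
    for _ in range(n_iter):
        grad = 2.0 * (H.T @ (H @ x - y)) + 2.0 * lam * (C.T @ (C @ x))
        x -= tau * grad                # fixed-step descent; tau assumed stable
    return x
```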


Regularization filtering

◮ Singular Value Decomposition (SVD) of $H$:
$H u_k = \alpha_k v_k$, $\quad H^* v_k = \alpha_k u_k$
$u_k$ ($v_k$): singular functions of $H$ ($H^*$) → basis of the solution space $X$ (data space $Y$)
$\alpha_k$: singular values of $H$ ($\lim_{k \to \infty} \alpha_k = 0$)

◮ Assume $C^*C\, u_k = c_k^2\, u_k$

Pseudo-solution:
$\hat x = \sum_k \frac{1}{\alpha_k} \langle y, v_k\rangle_Y\, u_k$
– data high frequencies $\langle y, v_k\rangle_Y$ ($k \gg 1$) are amplified

Regularized solution:
$x_\lambda = \sum_k \frac{\alpha_k^2}{\alpha_k^2 + \lambda c_k^2}\, \frac{1}{\alpha_k} \langle y, v_k\rangle_Y\, u_k$
– $x_\lambda$ is a filtered version of $\hat x$; its coordinates are bounded due to the constraint terms $c_k^2$
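A numerical sketch of this spectral filtering view (the random test operator and the assumed eigenvalues $c_k = k$, growing with $k$ to mimic a differential penalty, are illustrative):

```python
import numpy as np

rng = np.random.default_rng(2)
H = rng.standard_normal((30, 30))
y = rng.standard_normal(30)
lam = 0.1

U, alpha, Vt = np.linalg.svd(H)      # u_k = Vt[k], v_k = U[:, k]
c = np.arange(1, alpha.size + 1.0)   # assumed c_k, increasing with k
proj = U.T @ y                       # coefficients <y, v_k>_Y

x_pseudo = Vt.T @ (proj / alpha)              # 1/alpha_k amplifies noise
filt = alpha**2 / (alpha**2 + lam * c**2)     # spectral filter in [0, 1]
x_reg = Vt.T @ (filt * proj / alpha)          # damped high-frequency modes
```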


Regularization filtering

◮ Proof: Recall $\hat x = (H^*H)^{-1}H^*y$
$(u_k, \alpha_k^2)$ are eigenelements of $H^*H$:
$H u_k = \alpha_k v_k$, $\; H^* v_k = \alpha_k u_k$ $\;\Rightarrow\; H^*H\, u_k = \alpha_k^2\, u_k$
It follows that $(u_k, \frac{1}{\alpha_k^2})$ are eigenelements of $(H^*H)^{-1}$
Expressing $y$ in the basis $(v_k)$ of $Y$, we get successively
$y = \sum_k \langle y, v_k\rangle_Y\, v_k$
$H^*y = \sum_k \alpha_k \langle y, v_k\rangle_Y\, u_k$
$(H^*H)^{-1}H^*y = \sum_k \frac{1}{\alpha_k} \langle y, v_k\rangle_Y\, u_k$ □


Regularization filtering

◮ Proof (cont'd): Recall $x_\lambda = (H^*H + \lambda C^*C)^{-1}H^*y$
Expressing $x$ in the basis $(u_k)$ of $X$, we have $D^n x = \sum_k \langle D^n x, u_k\rangle_X\, u_k$
Since $C$ is a differential operator, $C^*C$ is diagonal over $(u_k)$
Hence $(u_k, \frac{1}{\alpha_k^2 + \lambda c_k^2})$ are eigenelements of $(H^*H + \lambda C^*C)^{-1}$
Expressing $y$ in the basis $(v_k)$ of $Y$, we get successively
$y = \sum_k \langle y, v_k\rangle_Y\, v_k$
$H^*y = \sum_k \alpha_k \langle y, v_k\rangle_Y\, u_k$
$(H^*H + \lambda C^*C)^{-1}H^*y = \sum_k \frac{\alpha_k}{\alpha_k^2 + \lambda c_k^2} \langle y, v_k\rangle_Y\, u_k$ □


Example | Variational image denoising

◮ Problem statement: Given a noisy image $L_n$ ($\equiv y$), estimate an optimal $C^1$-continuous denoised version $L$ ($\equiv x$) of $L_n$

◮ Modelling
  Assume a simple additive noise model: $L_n = L + n$
  Model $(L_n, L)$ as functions over some domain $\Omega \subset \mathbb{R}^2$ with pixel location $\mathbf{x} = (x, y)$ ($\equiv \eta$)
  $L$ can be found as a minimizer of the loss function
  $E_\lambda(L) = \underbrace{\int_\Omega \big(L(\mathbf{x}) - L_n(\mathbf{x})\big)^2\, d\mathbf{x}}_{\text{data link}} + \lambda\, \underbrace{\int_\Omega \|\nabla L(\mathbf{x})\|^2\, d\mathbf{x}}_{\text{regularization}}$
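A minimal sketch of this energy minimized by gradient descent. The Euler-Lagrange equation of $E_\lambda$ is $(L - L_n) - \lambda \Delta L = 0$, so each step uses the 4-neighbor discrete Laplacian; the periodic boundaries (via np.roll) and the step size are implementation assumptions:

```python
import numpy as np

def denoise(Ln, lam=1.0, tau=0.1, n_iter=200):
    """Gradient descent on  ||L - Ln||^2 + lam * ||grad L||^2."""
    L = Ln.copy()
    for _ in range(n_iter):
        lap = (np.roll(L, 1, 0) + np.roll(L, -1, 0) +
               np.roll(L, 1, 1) + np.roll(L, -1, 1) - 4.0 * L)
        L -= tau * ((L - Ln) - lam * lap)   # descend along the E-L gradient
    return L
```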


Nonlinear regularization

◮ Tikhonov theory extends to nonlinear regularization based on non-quadratic, discontinuity-preserving stabilizers

1st-order discontinuity-preserving stabilizers:
$\|Cx\|_Z^2 = \int \varphi\big(|Dx(\xi)|\big)\, d\xi$
where $\varphi$ verifies
  1. $\varphi(\sqrt{u})$ positive, strictly concave
  2. $\frac{\varphi'(u)}{u}$ bounded, $\varphi''(u) > 0$
  3. $\lim_{u \to 0} \frac{\varphi'(u)}{u} = \lim_{u \to 0} \varphi''(u) = 1$
  4. $\lim_{u \to +\infty} \frac{\varphi'(u)}{u} = \lim_{u \to +\infty} \varphi''(u) = 0$
  5. $\lim_{u \to +\infty} \frac{\varphi''(u)}{\varphi'(u)/u} = 0$


Nonlinear regularization

◮ Some classical discontinuity-preserving norms

  name                   $\varphi(u)$            $\varphi'(u)/u$
  Welsch                 $1 - e^{-u^2}$          $e^{-u^2}$
  Cauchy                 $\ln(1 + u^2)$          $\frac{1}{1+u^2}$
  Geman-McClure          $\frac{u^2}{1+u^2}$     $\frac{1}{(1+u^2)^2}$
  Green                  $\ln\cosh(u)$           $\frac{\tanh(u)}{u}$
  L1-L2                  $\sqrt{1+u^2} - 1$      $\frac{1}{\sqrt{1+u^2}}$
  Total Variation (TV)   $u$                     $\frac{1}{u}$
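The ratios $\varphi'(u)/u$ act as edge-stopping weights: a weight near 0 at large gradient magnitudes leaves discontinuities untouched. A sketch of these weights as plain functions; the small epsilon guarding the $u \to 0$ limits is an implementation assumption:

```python
import numpy as np

EPS = 1e-8   # guards the u -> 0 limits of tanh(u)/u and 1/u

weights = {                              # u: gradient magnitude, u >= 0
    "welsch":        lambda u: np.exp(-u**2),
    "cauchy":        lambda u: 1.0 / (1.0 + u**2),
    "geman_mcclure": lambda u: 1.0 / (1.0 + u**2)**2,
    "green":         lambda u: np.tanh(u + EPS) / (u + EPS),
    "l1_l2":         lambda u: 1.0 / np.sqrt(1.0 + u**2),
    "tv":            lambda u: 1.0 / np.maximum(u, EPS),
}
# Quadratic (Tikhonov) regularization corresponds to a constant weight 1;
# all the weights above decay to 0 as u grows, down-weighting strong edges.
```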


Hyperparameter estimation

◮ Optimal values of the regularization parameter $\lambda$ can be derived when upper bounds $\varepsilon_s$ on $\|Cx\|_Z$ (constraint satisfaction) or $\varepsilon_d$ on $\|Hx - y\|_Y$ (data noise) are known

$\varepsilon_s$ known: $\lambda$ is estimated as a Lagrange multiplier
$\min_{x \in X} \|Hx - y\|_Y$ s.t. $\|Cx\|_Z \le \varepsilon_s$ $\;\Rightarrow\; \exists\lambda:\;\; x_\lambda = \arg\min_{x \in X} E_\lambda(x)$, $\;\|Cx_\lambda\|_Z = \varepsilon_s$

$\varepsilon_d$ known: $\lambda$ is chosen using Morozov's discrepancy principle
$\min_{x \in X} \|Cx\|_Z$ s.t. $\|Hx - y\|_Y \le \varepsilon_d$ $\;\Rightarrow\; \exists\lambda:\;\; x_\lambda = \arg\min_{x \in X} E_\lambda(x)$, $\;\|Hx_\lambda - y\|_Y = \varepsilon_d$

$\varepsilon_s, \varepsilon_d$ known: $\lambda = \left(\frac{\varepsilon_d}{\varepsilon_s}\right)^2$
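Since the residual $\|Hx_\lambda - y\|_Y$ increases monotonically with $\lambda$, the discrepancy principle can be implemented by bisection on a log scale; a sketch in the matrix setting (the bracket bounds and the iteration count are assumptions):

```python
import numpy as np

def morozov_lambda(H, C, y, eps_d, lo=1e-8, hi=1e8, n_iter=60):
    """Find lam such that ||H x_lam - y|| = eps_d (discrepancy principle)."""
    def residual(lam):
        x = np.linalg.solve(H.T @ H + lam * (C.T @ C), H.T @ y)
        return np.linalg.norm(H @ x - y)
    for _ in range(n_iter):
        mid = np.sqrt(lo * hi)            # geometric midpoint
        if residual(mid) < eps_d:
            lo = mid                      # residual too small: increase lam
        else:
            hi = mid
    return np.sqrt(lo * hi)
```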


Hyperparameter estimation

◮ When these upper bounds are not available, a sub-optimal value of $\lambda$ can be derived using cross-validation techniques
  Build partial datasets $y^{]k[}$ leaving out $k$ data: $y = y^{]k[} \cup y^{[k]}$
  Minimize $E_\lambda(x)$ on $y^{]k[}$ → solutions $x_\lambda^{]k[}$
  Denote $\mathcal{K}$ the set of all possible choices of $[k]$
  Choose $\lambda$ as a minimizer of the k-fold cross-validation function
  $V(\lambda) = \frac{1}{|\mathcal{K}|} \sum_{[k] \in \mathcal{K}} \big\| H\, x_\lambda^{]k[} - y^{[k]} \big\|_Y$

◮ There is no reference method. Most often, hyperparameters are tuned empirically from experimental assessment over a reference database
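A sketch of $V(\lambda)$ in the matrix setting (the fold construction and the log-spaced $\lambda$ grid are illustrative assumptions):

```python
import numpy as np

def cv_score(H, C, y, lam, folds):
    """V(lam): mean held-out prediction error over the folds."""
    err = 0.0
    for hold in folds:
        keep = np.setdiff1d(np.arange(len(y)), hold)
        x = np.linalg.solve(H[keep].T @ H[keep] + lam * (C.T @ C),
                            H[keep].T @ y[keep])
        err += np.linalg.norm(H[hold] @ x - y[hold])
    return err / len(folds)

# Example usage (assumed 5 folds and a log grid of candidate values):
# folds = np.array_split(np.random.permutation(len(y)), 5)
# best = min(np.logspace(-4, 2, 25), key=lambda l: cv_score(H, C, y, l, folds))
```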


Stochastic regularization

◮ Data $y$ and unknowns $x$ are modeled as random variables

Stochastic inverse problem (P): Given noisy data $y \in Y$, estimate $x \in X$ such that $y = Hx + n$

◮ Gaussian case
  Data: $n$ is a white Gaussian noise with covariance $\Sigma_n = \sigma_n^2 I$
  $P(y|x) \propto \exp\left(-\frac{1}{2\sigma_n^2}\, \|Hx - y\|_Y^2\right)$
  Model: $x$ is a centered Gaussian process with covariance $\Sigma_x$
  $P(x) \propto \exp\left(-\frac{1}{2}\, \langle x, \Sigma_x^{-1} x\rangle_X\right)$

Stochastic regularization

◮ Maximum A Posteriori (MAP) estimate of $x$:
$x^* = \arg\max_{x \in X} P(x|y)$

Using Bayes' formula $P(x|y) \propto P(y|x)\, P(x)$:
$x^* = \arg\min_{x \in X} \frac{1}{\sigma_n^2} \big[\, \|Hx - y\|_Y^2 + \sigma_n^2\, \langle x, \Sigma_x^{-1} x\rangle_X \,\big]$

The MAP criterion is similar to a regularization functional
  $\|Hx - y\|_Y^2$ is a data consistency term
  $\langle x, \Sigma_x^{-1} x\rangle_X$ is a prior model term
  $\sigma_n^2$ is a regularization parameter

◮ These conclusions hold if $x$ obeys an arbitrary exponential law


Wiener filtering

Euler equations: The MAP estimate $x^*$ verifies the following Euler equations:
$(H^*\Sigma_n^{-1}H + \Sigma_x^{-1})\, x = H^*\Sigma_n^{-1} y$

Wiener-Kolmogorov filter: The Wiener-Kolmogorov filter $F_W : Y \to X$ is defined by
$F_W = (H^*\Sigma_n^{-1}H + \Sigma_x^{-1})^{-1} H^*\Sigma_n^{-1}$
The MAP estimate is $x^* = F_W\, y$
$F_W$ can be viewed as extending to the stochastic framework the regularized inverse $R_\lambda = (H^*H + \lambda C^*C)^{-1} H^*$
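A matrix sketch of the Wiener-Kolmogorov estimate, written (as above) with explicit inverse covariances; for white noise $\Sigma_n = \sigma_n^2 I$ it reduces to the Tikhonov form with $\lambda C^*C$ playing the role of $\sigma_n^2 \Sigma_x^{-1}$:

```python
import numpy as np

def wiener_estimate(H, y, Sigma_n_inv, Sigma_x_inv):
    """MAP / Wiener-Kolmogorov estimate x* = F_W y."""
    A = H.T @ Sigma_n_inv @ H + Sigma_x_inv
    return np.linalg.solve(A, H.T @ Sigma_n_inv @ y)
```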


Random Fields

◮ Let $x = (x_\eta)_{\eta \in V}$ be a collection of random variables over a finite undirected graph $G = (V, E)$, with vertices $V$ and edges $E$, taking values in a discrete phase space $\Lambda$
  $x$ defines a random field over $G$ with state space $\Lambda^V$

◮ In digital imaging
  $G$ ≡ pixel grid with a local topology (e.g. 4/8-connectivity)
    – vertices are known as sites
    – edges define neighborhood relationships
  $x$ ≡ visual information map
    – image (restoration | reconstruction)
    – labels (segmentation | classification)
    – features (motion | stereo disparity | ...)
  $\Lambda^V$ ≡ set of visual information maps over $G$


Random Fields

◮ Denote
  $N_\eta = \{\xi \in V \mid (\eta, \xi) \in E\}$ the neighborhood of vertex $\eta \in V$
  $\bar N_\eta = \{\eta\} \cup N_\eta$ the closed neighborhood of $\eta \in V$
  $x_S = (x_\eta)_{\eta \in S}$ the restriction of $x$ to a subset $S \subseteq V$

[Figure: a 4-connected grid, showing $N_\eta$, $\bar N_\eta$ and $V \setminus \bar N_\eta$]


Markov Random Fields

Markov Random Field (MRF): $x$ is a MRF iff
  1. [Positivity] $P(x) > 0$
  2. [Markovianity] At any $\eta \in V$, the probability of observing $x_\eta$ given realizations of $x$ at all other vertices depends only on realizations at the neighboring vertices $N_\eta$:
  $P\big(x_\eta \mid x_{V \setminus \{\eta\}}\big) = P\big(x_\eta \mid x_{N_\eta}\big)$

$N_\eta$ is the set of vertices which are statistically related to $\eta$ w.r.t. $x$
Statistical dependence between vertices within $N_\eta$ is referred to as site interactions


Markov Random Fields

◮ Markovianity is equivalent to stating conditional independence of $x_\eta$ and $x_{V \setminus \bar N_\eta}$ given $x_{N_\eta}$:
$P\big(x_\eta, x_{V \setminus \bar N_\eta} \mid x_{N_\eta}\big) = P\big(x_\eta \mid x_{N_\eta}\big)\; P\big(x_{V \setminus \bar N_\eta} \mid x_{N_\eta}\big)$
$N_\eta$ separates $V$ into two sets of vertices, $\{\eta\}$ and $V \setminus \bar N_\eta$, which are statistically independent w.r.t. $x$

◮ The spatial geometry of site interactions is captured by cliques


Cliques

◮ Given a graph $G = (V, E)$, an nth-order clique is an n-tuple of vertices $C_n$ in which any two distinct vertices are neighbors:
$C_n = \big\{(\eta_1, \cdots, \eta_n) \in V^n \mid (\eta_i, \eta_j) \in E,\; i \neq j\big\}$
The set of cliques (nth-order cliques) of $G$ is denoted $\mathcal{C}$ ($\mathcal{C}_n$)

[Figure: on a 4-connected grid, the cliques $C_1$ through $C_4$ around a site $\eta$]


Markov Random Fields

◮ Clique factorization: If $x$ is a MRF, its (joint) probability distribution can be factorized over the cliques of $G$:
$P(x) = \prod_{C \in \mathcal{C}} \phi_C(x_C)$
where $(\phi_C)_{C \in \mathcal{C}}$ is a family of non-negative functions acting on the restrictions of $x$ to the cliques of $G$
This also holds locally for the conditional probabilities $P(x_\eta \mid x_{N_\eta})$, which factorize over the cliques of $\bar N_\eta$


Gibbs Fields

Gibbs field: $x$ is a Gibbs field if $x$ has a Gibbs distribution
$P(x) = \frac{1}{Z} \exp\big(-E(x)\big)$
defined by a global energy function $E(x)$ deriving from a family $(V_C)_{C \in \mathcal{C}}$ of local potentials on cliques:
$E(x) = \sum_{C \in \mathcal{C}} V_C(x)$
$Z$ is a normalizing constant known as the partition function:
$Z = \sum_{x \in \Lambda^V} \exp\big(-E(x)\big)$
If a clique $C$ is not involved, then $V_C = 0$
Computing $Z$ is intractable


Hammersley-Clifford Theorem

Theorem (Hammersley-Clifford): $x$ is a MRF iff $x$ has a Gibbs distribution
  if $x$ is a Gibbs field, then $x$ satisfies the Markovianity property
  if $x$ is a MRF with probability distribution $P$, there exists a family $(V_C)_{C \in \mathcal{C}}$ of potentials on cliques such that
  $P(x) = \frac{1}{Z} \exp\Big(-\sum_{C \in \mathcal{C}} V_C(x)\Big)$

◮ An nth-order MRF operates at most on nth-order cliques $\mathcal{C}_n$


Hammersley-Clifford Theorem

◮ The MRF-Gibbs equivalence also holds in local form for site conditional probabilities:
$P\big(x_\eta \mid x_{N_\eta}\big) = \frac{1}{Z_\eta} \exp\big(-E(x_\eta \mid x_{N_\eta})\big)$
defined by a local conditional energy deriving from a family of potentials on the cliques of $\bar N_\eta$:
$E(x_\eta \mid x_{N_\eta}) = \sum_{C \in \mathcal{C} \mid \eta \in C} V_C(x)$
and a local partition function:
$Z_\eta = \sum_{\lambda \in \Lambda} \exp\big(-E(\lambda \mid x_{N_\eta})\big)$
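A tiny sketch of this local form for a discrete label field with pairwise potentials (the Potts potential and the neighbor values are illustrative assumptions): the local conditional is obtained by normalizing $\exp(-E(\lambda \mid x_{N_\eta}))$ over the phase space $\Lambda$.

```python
import numpy as np

def local_conditional(labels, neighbor_vals, V):
    """P(x_eta = lab | neighbors) from pairwise clique potentials V(a, b)."""
    energies = np.array([sum(V(lab, nb) for nb in neighbor_vals)
                         for lab in labels])
    p = np.exp(-energies)
    return p / p.sum()       # division by the local partition function Z_eta

potts = lambda a, b: 0.0 if a == b else 1.0     # assumed Potts potential
print(local_conditional([0, 1, 2], [0, 0, 1, 2], potts))
```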


Clique potentials

◮ A MRF is completely defined by its local clique potentials $(V_C)_{C \in \mathcal{C}}$
  Potentials model pixel interactions within a clique
  They can be viewed as neighborhood operators
◮ Differential kernels are neighborhood operators
  MRFs thus allow for enforcing discrete smoothness priors within the MAP estimation framework
  For instance, choosing $V_C(x)$ as a gradient filter norm yields a 1st-order MRF


Example | Stochastic image denoising

◮ Problem statement: Given a noisy image $L_n$ ($\equiv y$), estimate an optimal $C^1$-continuous denoised version $L$ ($\equiv x$) of $L_n$

◮ Modelling
  Assume an additive Gaussian noise model: $L_n = L + n$ with $n \sim \mathcal{N}(0, \sigma_n^2)$
  Model $L$ as a 1st-order MRF over some 4-connected grid $G \subset \mathbb{Z}^2$ with site $\eta = (i, j)$ and neighbors denoted $\xi \sim \eta$
  A MAP estimate of $L$ is found by minimizing the energy
  $E_\lambda(L) = \underbrace{\frac{1}{\sigma_n^2} \sum_{\eta \in G} \big(L_\eta - L_{n,\eta}\big)^2}_{\text{data link}} + \lambda\, \underbrace{\sum_{\eta \in G} \sum_{\xi \sim \eta} \big(L_\eta - L_\xi\big)^2}_{\text{regularization}}$
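Because this energy is quadratic, each site's local conditional energy has a closed-form minimizer given its 4 neighbors. A sketch using simultaneous (Jacobi-style) updates rather than sequential ICM; the periodic boundaries via np.roll are an implementation assumption:

```python
import numpy as np

def mrf_denoise(Ln, lam=1.0, sigma2=1.0, n_iter=100):
    """Iterated local minimization of the quadratic MRF energy."""
    L = Ln.copy()
    for _ in range(n_iter):
        nsum = (np.roll(L, 1, 0) + np.roll(L, -1, 0) +   # 4-neighbor sum
                np.roll(L, 1, 1) + np.roll(L, -1, 1))
        # Zero the derivative of each site's local conditional energy:
        L = (Ln / sigma2 + 2.0 * lam * nsum) / (1.0 / sigma2 + 8.0 * lam)
    return L
```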


Markov Random Fields

◮ Graphical representation
  MRFs can be described with graph representations, which capture conditional independence relationships between interacting random variables and provide powerful tools for inference and learning

[Figure: a bipartite graph linking the observed image data $y$ to the latent (hidden) solution estimate $x$]

  MRFs are an instance of a broader class of stochastic models known as (undirected) probabilistic graphical models (PGMs)


Deterministic vs. stochastic regularization

◮ Deterministic and stochastic regularization optimize criteria with similar structure but context-specific interpretations, i.e. a loss function $E_\lambda(x)$ or a posterior conditional probability $P(x|y)$
◮ They differ in the optimization approach
  Deterministic regularization uses either variational methods to zero the derivative $\partial_x E_\lambda(x)$, or graph-cut methods to directly minimize $E_\lambda(x)$
  Stochastic regularization uses Monte-Carlo or graph-cut methods to directly maximize the posterior $P(x|y)$


Deterministic vs. stochastic regularization

  modeling        optimization approach   loss / energy differentiability   detected extrema
  deterministic   variational             yes                               local∗
                  graph-cut               no                                global
  stochastic      Monte-Carlo             no                                global
                  graph-cut               no                                global

  ∗ in the vicinity of the initialization

◮ Direct optimization approaches apply to a broader class of criteria and provide increased robustness w.r.t. spurious local extrema induced by noise
