Some applications of proximal methods


SLIDE 1

GENERAL CONTEXT PROXIMAL TOOLS APPLICATIONS CONCLUSION

Some applications of proximal methods

Caroline CHAUX

Joint work with P. L. Combettes, L. Duval, J.-C. Pesquet and N. Pustelnik

LATP - UMR CNRS 7353, Aix-Marseille Université, France

OSL 2013, Les Houches, 7-11 Jan. 2013

1 / 34

SLIDES 2-7

Direct problem: z = D_α(L y)

• z: observations (e.g. a 2D signal of size M = M1 × M2);
• y: original signal (unknown, of size N);
• L: linear operator (matrix of size M × N);
• D_α: perturbation of parameter α.

Objective (inverse problem): find an estimate ŷ of y from the observations z.
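The direct model z = D_α(L y) can be sketched numerically. Everything concrete below is an illustrative assumption, not the talk's actual setup: a 16×16 signal, a 3×3 moving-average blur standing in for L, and additive Gaussian noise of standard deviation α standing in for the perturbation D_α.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical sizes: a small 2D signal y of size N = 16 x 16, observed
# through a linear operator L (a 3x3 moving-average blur written as an
# explicit M x N matrix) and a perturbation D_alpha (additive Gaussian
# noise of standard deviation alpha).
n1 = n2 = 16
N = n1 * n2
y = rng.random(N)                        # unknown original signal

def blur_matrix(n1, n2):
    """Explicit matrix of a 3x3 moving-average blur (zero padding)."""
    L = np.zeros((n1 * n2, n1 * n2))
    for i in range(n1):
        for j in range(n2):
            row = i * n2 + j
            for di in (-1, 0, 1):
                for dj in (-1, 0, 1):
                    ii, jj = i + di, j + dj
                    if 0 <= ii < n1 and 0 <= jj < n2:
                        L[row, ii * n2 + jj] = 1.0 / 9.0
    return L

L = blur_matrix(n1, n2)                  # M x N (here M = N)
alpha = 0.01
z = L @ y + alpha * rng.standard_normal(N)   # observations z = D_alpha(L y)
```

Recovering y from z is the inverse problem addressed in the rest of the slides.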

SLIDE 8

FRAME REPRESENTATION

Frame coefficients (x) ↔ Original (y = F^* x)

• x ∈ R^K: frame coefficients of the original image y ∈ R^N;
• F^*: R^K → R^N: frame synthesis operator such that

∃(ν, ν̄) ∈ ]0, +∞[², ν Id ≤ F^* ∘ F ≤ ν̄ Id (tight frame when ν = ν̄),

with y = F^* x.

[L. Jacques et al., 2011]
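The frame condition ν Id ≤ F^* ∘ F ≤ ν̄ Id can be checked numerically on a toy tight frame. The construction below (the union of two orthonormal bases of R^N, one of them an arbitrary orthonormal matrix Q) is an illustrative assumption; for it, F^* ∘ F = 2 Id, i.e. a tight frame with ν = ν̄ = 2.

```python
import numpy as np

rng = np.random.default_rng(1)
N = 8

# Toy tight frame: analysis operator F stacking two orthonormal bases
# of R^N (the canonical basis and an arbitrary orthonormal Q).
Q, _ = np.linalg.qr(rng.standard_normal((N, N)))
F = np.vstack([np.eye(N), Q.T])      # F: R^N -> R^K with K = 2N

G = F.T @ F                          # F* o F, an N x N matrix
eigs = np.linalg.eigvalsh(G)
nu_lower, nu_upper = eigs.min(), eigs.max()
print(nu_lower, nu_upper)            # both close to 2 (tight frame)
```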

SLIDES 9-15

VARIATIONAL APPROACH

minimize over x ∈ H: Σ_{j=1}^J f_j(L_j x)

where (f_j)_{1≤j≤J} are functions in the class Γ0(G_j) (the class of l.s.c. proper convex functions on G_j taking their values in ]−∞, +∞]) and where, for every j ∈ {1, ..., J}, L_j: H → G_j is a bounded linear operator, the (G_j)_{1≤j≤J} denoting Hilbert spaces. This criterion can be nondifferentiable.

• f_j can be related to the noise (e.g. a quadratic term when the noise is Gaussian);
• f_j can be related to some a priori on the target solution (e.g. an a priori on the wavelet-coefficient distribution);
• f_j can be related to a constraint (e.g. a support constraint);
• L_j can model a blur operator;
• L_j can model a gradient operator (e.g. total variation);
• L_j can model a frame operator.
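A composite criterion of this form can be assembled as a list of (f_j, L_j) pairs. The sketch below is a toy instance under stated assumptions (a stand-in "blur" matrix A, a quadratic data-fidelity term for Gaussian noise, and an ℓ1 sparsity term with L_2 = Id); it only evaluates the criterion, it does not minimize it.

```python
import numpy as np

# Toy composite criterion sum_j f_j(L_j x):
#   f_1 = 0.5 ||. - z||^2 with L_1 = A (data fidelity, Gaussian noise),
#   f_2 = lam * ||.||_1  with L_2 = Id (sparsity prior).
# Sizes and values are illustrative.
rng = np.random.default_rng(2)
N = 32
A = np.eye(N) + 0.1 * rng.standard_normal((N, N))   # stand-in "blur"
z = rng.standard_normal(N)
lam = 0.5

terms = [
    (lambda v: 0.5 * np.sum((v - z) ** 2), A),        # (f_1, L_1)
    (lambda v: lam * np.sum(np.abs(v)), np.eye(N)),   # (f_2, L_2)
]

def criterion(x):
    """Evaluate sum_j f_j(L_j x)."""
    return sum(f(L @ x) for f, L in terms)

print(criterion(np.zeros(N)))   # at x = 0 this equals 0.5 * ||z||^2
```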

SLIDES 16-19

ANALYSIS VS. SYNTHESIS APPROACH

When frame decompositions are considered, the problem can be formulated in a:

Synthesis Form (SF): minimize over x ∈ R^K: Σ_{r=1}^R f_r(L_r F^* x) + Σ_{s=1}^S g_s(x)

Analysis Form (AF): minimize over y ∈ R^N: Σ_{r=1}^R f_r(L_r y) + Σ_{s=1}^S g_s(F y)

Inclusion: AF is a particular case of SF [Chaâri et al., 2009].

Equivalence: the two forms are equivalent when F is an orthonormal transform.

SLIDES 20-22

PROXIMAL APPROACHES

The proximity operator of φ ∈ Γ0(H) is defined as

prox_φ: H → H: u ↦ argmin_{v∈H} (1/2)‖v − u‖² + φ(v).

Remark: if C is a nonempty closed convex subset of H and ι_C denotes the indicator function of C, i.e. (∀u ∈ H) ι_C(u) = 0 if u ∈ C and +∞ otherwise, then prox_{ι_C} reduces to the projection Π_C onto C.

Let φ ∈ Γ0(G) and L: H → G be a bounded linear operator. Suppose L L^* = χ Id for some χ ∈ ]0, +∞[. Then

prox_{φ∘L} = Id + χ^{−1} L^*(prox_{χφ} − Id) L.
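Two of the closed-form proximity operators used throughout the talk can be sketched directly: the prox of γ‖·‖₁ (componentwise soft thresholding) and the prox of the indicator of a box, which is just the projection onto it. The box bounds and test vector below are illustrative.

```python
import numpy as np

def prox_l1(u, gamma):
    """prox of gamma * ||.||_1: componentwise soft thresholding."""
    return np.sign(u) * np.maximum(np.abs(u) - gamma, 0.0)

def proj_box(u, lo, hi):
    """prox of the indicator of C = [lo, hi]^N, i.e. the projection onto C."""
    return np.clip(u, lo, hi)

u = np.array([-2.0, -0.3, 0.0, 0.7, 3.0])
v = prox_l1(u, 1.0)        # shrinks each entry toward 0 by 1
w = proj_box(u, 0.0, 1.0)  # clips each entry into [0, 1]
```

By definition, v minimizes (1/2)‖·−u‖² + ‖·‖₁, which can be verified by perturbing each coordinate.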

SLIDE 23

Minimize over x: Σ_{j=1}^J f_j(x)

• When J = 2: Forward-Backward algorithm [Figueiredo and Nowak, 2003], [Bect et al., 2004], [Daubechies et al., 2004], [Combettes and Wajs, 2005], [Chaux et al., 2007], [Beck and Teboulle, 2009]; Douglas-Rachford algorithm [Lions and Mercier, 1979], [Combettes and Pesquet, 2007].
• When J > 2: Parallel ProXimal Algorithm (PPXA) [Combettes and Pesquet, 2008].
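For J = 2 with one smooth term, the forward-backward iteration alternates a gradient step on the smooth term and a prox step on the other. Below is a minimal sketch on a toy lasso-type problem (f1 = 0.5‖Ax − z‖² smooth, f2 = λ‖x‖₁ proximable); the data, λ, and iteration count are illustrative assumptions.

```python
import numpy as np

# Minimal forward-backward (ISTA-style) sketch for J = 2:
#   minimize 0.5 ||A x - z||^2 + lam ||x||_1.
rng = np.random.default_rng(3)
M, N = 20, 10
A = rng.standard_normal((M, N))
z = rng.standard_normal(M)
lam = 0.1

def soft(u, t):
    return np.sign(u) * np.maximum(np.abs(u) - t, 0.0)

gamma = 1.0 / np.linalg.norm(A.T @ A, 2)   # step in ]0, 2/L[, L = ||A^T A||
x = np.zeros(N)
for _ in range(5000):
    grad = A.T @ (A @ x - z)                 # forward (gradient) step on f_1
    x = soft(x - gamma * grad, gamma * lam)  # backward (prox) step on f_2
```

At convergence x satisfies the lasso optimality conditions, which gives a direct numerical check.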

SLIDES 24-25

PPXA+: minimize over u ∈ H: Σ_{j=1}^J f_j(L_j u)

Initialization:
• (ǫ_j)_{1≤j≤J} ∈ [0, 1[^J, (ω_j)_{1≤j≤J} ∈ ]0, +∞[^J, (λ_n)_{n∈N} a sequence of reals;
• (z_j^[0])_{1≤j≤J} and (p_j^[−1])_{1≤j≤J} in G_1 × · · · × G_J;
• for every j ∈ {1, ..., J}, (a_j^[n])_{n∈N} a sequence of error terms;
• u^[0] = argmin_{u∈H} Σ_{j=1}^J ω_j ‖L_j u − z_j^[0]‖².

For n = 0, 1, ...
  For j = 1, ..., J:
    p_j^[n] = prox_{(1−ǫ_j) f_j / ω_j}((1 − ǫ_j) z_j^[n] + ǫ_j p_j^[n−1]) + a_j^[n]
  c^[n] = argmin_{u∈H} Σ_{j=1}^J ω_j ‖L_j u − p_j^[n]‖²
  For j = 1, ..., J:
    z_j^[n+1] = z_j^[n] + λ_n (L_j(2 c^[n] − u^[n]) − p_j^[n])
  u^[n+1] = u^[n] + λ_n (c^[n] − u^[n])
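The structure of the iteration can be sketched in its simplest special case, PPXA (ǫ_j = 0 and L_j = Id, see the next slides), for two proximable functions. The toy problem below (two quadratics whose minimizer is (a + b)/2), the weights, and the relaxation parameter are illustrative assumptions.

```python
import numpy as np

# Minimal PPXA sketch (PPXA+ with eps_j = 0, L_j = Id):
#   minimize f1(x) + f2(x), f1 = 0.5 (x - a)^2, f2 = 0.5 (x - b)^2,
# whose minimizer is (a + b) / 2.
a, b = 0.0, 1.0
w = np.array([0.5, 0.5])     # weights omega_j, summing to 1
lam = 1.0                    # relaxation lambda_n

def prox_quad(v, c, gamma):
    """prox of gamma * 0.5 (. - c)^2: (v + gamma c) / (1 + gamma)."""
    return (v + gamma * c) / (1.0 + gamma)

z = np.zeros(2)              # auxiliary variables z_j
u = w @ z                    # u = sum_j omega_j z_j
for _ in range(200):
    # p_j = prox_{f_j / omega_j}(z_j); here f_j / omega_j = 2 f_j
    p = np.array([prox_quad(z[0], a, 2.0), prox_quad(z[1], b, 2.0)])
    c = w @ p                # weighted average of the p_j
    z = z + lam * (2.0 * c - u - p)
    u = u + lam * (c - u)
print(u)                     # close to 0.5
```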

SLIDE 26

PPXA+ CONVERGENCE

Proposition [Pesquet and Pustelnik, 2012]. The weak convergence of the sequence (u^[n])_{n∈N} to a minimizer of Σ_{j=1}^J f_j ∘ L_j is established under the following assumptions:

1. 0 ∈ sri {(L_1 v − w_1, ..., L_J v − w_J) | v ∈ H, w_1 ∈ dom f_1, ..., w_J ∈ dom f_J};
2. there exists λ ∈ ]0, 2[ such that (∀n ∈ N) λ ≤ λ_{n+1} ≤ λ_n;
3. for every j ∈ {1, ..., J}, (a_j^[n])_{n∈N} is an absolutely summable sequence;
4. Σ_{j=1}^J ω_j L_j^* L_j is an isomorphism (the PPXA+ iterations can be slightly modified to avoid this assumption).

SLIDE 27

PPXA+: A GENERAL FRAMEWORK

1. PPXA [Combettes and Pesquet, 2008, Algorithm 3.1] is the special case of PPXA+ in which ǫ_1 = · · · = ǫ_J = 0, G_1 = · · · = G_J = H, and L_1 = · · · = L_J = Id.
2. The SDMM algorithm derived from Douglas-Rachford in [Setzer et al., 2010] is the special case in which ǫ_1 = · · · = ǫ_J = 0, ω_1 = · · · = ω_J, λ_n ≡ 1 and (a_j^[n])_{1≤j≤J} ≡ (0, ..., 0).
3. The algorithm introduced in [Attouch and Soueycatt, 2009] is the special case in which ǫ_1 = · · · = ǫ_J = α/(1 + α) and (a_j^[n])_{1≤j≤J} ≡ (0, ..., 0).

SLIDE 28

OTHER PROXIMAL APPROACHES: minimize over x: Σ_{j=1}^J f_j(L_j x)

• Parallel ProXimal Algorithm+ (PPXA+) [Pesquet and Pustelnik, 2012]: in the same spirit as PPXA, requires computing each prox_{f_j}; quadratic minimizations must be performed in the initialization step and in the computation of one intermediate variable, i.e. a large-size linear operator must be inverted.
• Generalized Forward-Backward [Raguet et al., 2012].
• Primal-dual approaches:
  • M+SFBF [Briceño-Arias and Combettes, 2011]: requires computing each prox_{f_j}; the algorithm stepsize depends on the L_j.
  • M+LFBF [Combettes and Pesquet, 2011]: one function f_{j0} may have a Lipschitz gradient; requires computing the gradient of f_{j0} and each prox_{f_j} for j ≠ j0; the algorithm stepsize depends on the L_j.
• FB-based algorithms [Chambolle and Pock, 2011], [Vũ, 2013], [Condat, 2013].

SLIDE 29

CONSTRAINED FORMULATION

Minimize over x ∈ H: Σ_{r=1}^R g_r(T_r x)  subject to  H_1 x ∈ C_1, ..., H_S x ∈ C_S,

where
• H: real Hilbert space;
• Γ0(H): class of proper, l.s.c., convex functions from H to ]−∞, +∞];
• (∀s ∈ {1, ..., S}) H_s: H → R^{Q_s} is a bounded linear operator;
• (∀s ∈ {1, ..., S}) C_s is a nonempty closed convex subset of R^{Q_s};
• (∀r ∈ {1, ..., R}) T_r: H → R^{N_r} is a bounded linear operator;
• (∀r ∈ {1, ..., R}) g_r ∈ Γ0(R^{N_r}).

SLIDE 30

CONSTRAINED FORMULATION

Under technical assumptions, the sequence (x^[n])_{n∈N} generated by M+SFBF [Briceño-Arias and Combettes, 2011] converges to x̂:

For n = 0, 1, ...
  x^[n] = Σ_{r=1}^R ω_r u_r^[n] + Σ_{s=1}^S ω_s u_s^[n]
  For r = 1, ..., R:
    w_{1,r}^[n] = u_r^[n] − γ_n T_r^* v_r^[n]
    w_{2,r}^[n] = v_r^[n] + γ_n T_r u_r^[n]
  For s = 1, ..., S:
    w_{1,s}^[n] = u_s^[n] − γ_n H_s^* v_s^[n]
    w_{2,s}^[n] = v_s^[n] + γ_n H_s u_s^[n]
  p_1^[n] = Σ_{r=1}^R ω_r w_{1,r}^[n] + Σ_{s=1}^S ω_s w_{1,s}^[n]
  For r = 1, ..., R:
    p_{2,r}^[n] = w_{2,r}^[n] − (γ_n/ω_r) prox_{(ω_r/γ_n) g_r}((ω_r/γ_n) w_{2,r}^[n])   ← proximity-operator computation
    q_{1,r}^[n] = p_1^[n] − γ_n T_r^* p_{2,r}^[n]
    q_{2,r}^[n] = p_{2,r}^[n] + γ_n T_r p_1^[n]
    Update u_r^[n+1] and v_r^[n+1]
  For s = 1, ..., S:
    p_{2,s}^[n] = w_{2,s}^[n] − (γ_n/ω_s) Π_{C_s}((ω_s/γ_n) w_{2,s}^[n])   ← projection computation
    q_{1,s}^[n] = p_1^[n] − γ_n H_s^* p_{2,s}^[n]
    q_{2,s}^[n] = p_{2,s}^[n] + γ_n H_s p_1^[n]
    Update u_s^[n+1] and v_s^[n+1]

SLIDES 31-35

CONSTRAINED FORMULATION

(∀x ∈ H) H_s x ∈ C_s ⇔ h_s(H_s x) ≤ η_s

Dropping the index s: (∀u ∈ R^Q) u ∈ C ⇔ h(u) ≤ η.

Splitting u into L blocks, u = [(u^(1))^⊤, ..., (u^(L))^⊤]^⊤ ∈ R^Q with block u^(ℓ) of size Q^(ℓ):

u ∈ C ⇔ Σ_{ℓ=1}^L h^(ℓ)(u^(ℓ)) ≤ η

→ Any closed convex subset C can be expressed in this way by setting η = 0, L = 1 and h = d_C.

Question: what can we do if Π_C does not have a closed form?

SLIDE 36

EPIGRAPHICAL PROJECTION

For every u = [(u^(1))^⊤, ..., (u^(L))^⊤]^⊤ ∈ R^Q (block u^(ℓ) of size Q^(ℓ)),

u ∈ C ⇔ Σ_{ℓ=1}^L h^(ℓ)(u^(ℓ)) ≤ η.

By introducing the auxiliary vector ζ = (ζ^(ℓ))_{1≤ℓ≤L} ∈ R^L:

u ∈ C ⇔ { Σ_{ℓ=1}^L ζ^(ℓ) ≤ η  and  (∀ℓ ∈ {1, ..., L}) h^(ℓ)(u^(ℓ)) ≤ ζ^(ℓ) }.

SLIDES 37-38

EPIGRAPHICAL PROJECTION

u ∈ C ⇔ { ζ ∈ V and (u, ζ) ∈ E }, where

• V denotes the closed half-space V = {ζ ∈ R^L | 1_L^⊤ ζ ≤ η}
  → Π_V has a closed form: projection onto a half-space;
• E is the closed convex set associated with the epigraphical constraint,
  E = {(u, ζ) ∈ R^Q × R^L | (∀ℓ ∈ {1, ..., L}) (u^(ℓ), ζ^(ℓ)) ∈ epi h^(ℓ)}
  → Π_E has a closed form for specific choices of h^(ℓ).

SLIDES 39-40

EPIGRAPHICAL PROJECTION

• Euclidean norm functions, defined for every ℓ ∈ {1, ..., L} and every u^(ℓ) ∈ R^{Q^(ℓ)} as

  h^(ℓ)(u^(ℓ)) = τ^(ℓ) ‖u^(ℓ)‖,  where τ^(ℓ) ∈ ]0, +∞[.

• Epigraphical projection: for every (u^(ℓ), ζ^(ℓ)) ∈ R^{Q^(ℓ)} × R,

  Π_{epi h^(ℓ)}(u^(ℓ), ζ^(ℓ)) =
    (u^(ℓ), ζ^(ℓ)),                 if τ^(ℓ) ‖u^(ℓ)‖ ≤ ζ^(ℓ),
    (0, 0),                         if ‖u^(ℓ)‖ < −τ^(ℓ) ζ^(ℓ),
    α^(ℓ) (u^(ℓ), τ^(ℓ) ‖u^(ℓ)‖),  otherwise,

  where α^(ℓ) = (1 + τ^(ℓ) ζ^(ℓ)/‖u^(ℓ)‖) / (1 + (τ^(ℓ))²).
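The closed form above translates directly into a few lines; the sketch below implements the three-branch formula for h(u) = τ‖u‖₂ (the test point is illustrative).

```python
import numpy as np

def proj_epi_norm(u, zeta, tau):
    """Projection of (u, zeta) onto epi h, h(u) = tau * ||u||_2."""
    nu = np.linalg.norm(u)
    if tau * nu <= zeta:              # already in the epigraph
        return u.copy(), zeta
    if nu < -tau * zeta:              # projects onto the origin
        return np.zeros_like(u), 0.0
    alpha = (1.0 + tau * zeta / nu) / (1.0 + tau ** 2)
    return alpha * u, alpha * tau * nu

u, zeta = np.array([3.0, 4.0]), 0.0   # ||u|| = 5, with tau = 1
pu, pz = proj_epi_norm(u, zeta, 1.0)
print(pu, pz)                         # [1.5 2. ] 2.5, on the cone boundary
```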

SLIDE 41

EPIGRAPHICAL PROJECTION

• Infinity-norm functions, defined for every ℓ ∈ {1, ..., L} and every u^(ℓ) = (u^(ℓ,m))_{1≤m≤Q^(ℓ)} ∈ R^{Q^(ℓ)} as

  h^(ℓ)(u^(ℓ)) = max { |u^(ℓ,m)| / τ^(ℓ,m) | 1 ≤ m ≤ Q^(ℓ) },

  where (τ^(ℓ,m))_{1≤m≤Q^(ℓ)} ∈ ]0, +∞[^{Q^(ℓ)}.

Π_{epi h^(ℓ)}(u^(ℓ), ζ^(ℓ)) has a closed form [G. Chierchia et al., 2012].

SLIDE 42

RECONSTRUCTION PROBLEM: PET

• High level of noise
• Large amount of data

SLIDES 43-46

RECONSTRUCTION PROBLEM

z = P_α(A y), where
• P_α: Poisson noise of scale parameter α,
• A: projection matrix.

Our objective is:

min over x ∈ R^K: Σ_{t=1}^T [ D_KL(A F_t^* x, z) + κ tv(F_t^* x) ] + ι_C(x) + ϑ ‖x‖_{ℓ1},  with y = F^* x = (F_t^* x)_{1≤t≤T},

where κ > 0, ϑ > 0 and
• D_KL is the Kullback-Leibler divergence ⇒ split into several proximable functions;
• tv represents a total-variation term ⇒ closed form in [Combettes and Pesquet, 2008];
• ι_C is the indicator function of a closed convex set C ⇒ projection onto C;
• ‖x‖_{ℓ1} denotes the ℓ1-norm ⇒ soft thresholding [Chaux et al., 2007].
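The Poisson data term behind the Kullback-Leibler divergence admits a standard componentwise closed-form prox. The sketch below uses the commonly cited formula for φ(v) = v − z log v (an assumption about the exact form used in the talk; valid for z > 0), and checks it against the first-order optimality condition.

```python
import numpy as np

# Componentwise prox of gamma * phi, phi(v) = v - z log(v), the Poisson
# negative log-likelihood term behind the KL data fidelity (z > 0).
# Solving (v - u) + gamma * (1 - z / v) = 0 for v > 0 gives:
def prox_kl(u, z, gamma):
    return 0.5 * (u - gamma + np.sqrt((u - gamma) ** 2 + 4.0 * gamma * z))

u = np.array([0.2, 1.0, 5.0])
z = np.array([1.0, 2.0, 3.0])
v = prox_kl(u, z, 0.5)
print((v - u) + 0.5 * (1.0 - z / v))   # optimality residual, close to 0
```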

SLIDES 47-50

PET RECONSTRUCTION RESULTS

(Figures: slice-by-slice reconstruction results, comparing the original image with the SIEVES and PPXA reconstructions.)

SLIDE 51

IMAGE RESTORATION WITH MISSING SAMPLES

Original: y ∈ R^N; degraded: z ∈ R^M, with z = A y + b, where
• y: original image in [0, 255]^N, assumed to be sparse after some appropriate transform;
• A ∈ R^{M×N}: randomly decimated convolution;
• b ∈ R^M: realization of a zero-mean white Gaussian noise;
• z: degraded image of size M.
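The degradation model z = A y + b can be sketched in 1D: a banded convolution matrix followed by the selection of a random subset of its rows ("random decimation"). The kernel, sizes, and noise level are illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(4)

# Randomly decimated convolution A: convolution matrix H, then keep a
# random subset of rows. Sizes and kernel are illustrative.
N, kernel = 64, np.array([0.25, 0.5, 0.25])
H = np.zeros((N, N))
for i in range(N):
    for k, c in enumerate(kernel):
        j = i + k - 1                     # centered kernel, zero padding
        if 0 <= j < N:
            H[i, j] = c

keep = np.sort(rng.choice(N, size=N // 2, replace=False))
A = H[keep]                               # M x N with M = N // 2

y = rng.uniform(0.0, 255.0, size=N)       # original signal in [0, 255]^N
b = 2.0 * rng.standard_normal(len(keep))  # zero-mean white Gaussian noise
z = A @ y + b                             # degraded observations
```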

SLIDE 52

IMAGE RESTORATION WITH MISSING SAMPLES

ŷ ∈ Argmin_{y ∈ [0,255]^N} ‖A y − z‖²  subject to  Σ_{ℓ=1}^N ‖Y^(ℓ)‖_p ≤ η,

where
• Y^(ℓ) = (ω_{ℓ,n}(y^(ℓ) − y^(n)))_{n ∈ N_ℓ};
• p ≥ 1 and η > 0.

Particular cases:
• ℓ2-TV: p = 2, ω_{ℓ,n} = 1, and N_ℓ the horizontal and vertical neighbours;
• ℓ∞-TV: p = ∞, ω_{ℓ,n} = 1, and N_ℓ the horizontal and vertical neighbours;
• ℓ2-NLTV: p = 2, ω_{ℓ,n} as in [Foi and Boracchi, 2012] and N_ℓ as in [Gilboa and Osher, 2007];
• ℓ∞-NLTV: p = ∞, ω_{ℓ,n} as in [Foi and Boracchi, 2012] and N_ℓ as in [Gilboa and Osher, 2007].

SLIDE 53

IMAGE RESTORATION WITH MISSING SAMPLES

Argmin over y: ‖A y − z‖² s.t. Σ_{ℓ=1}^N ‖Y^(ℓ)‖_p ≤ η and y ∈ [0, 255]^N

⇕ (epigraphical reformulation)

Argmin over (y, ζ): ‖A y − z‖² s.t. (∀ℓ ∈ {1, ..., N}) ‖Y^(ℓ)‖_p ≤ ζ^(ℓ), Σ_{ℓ=1}^N ζ^(ℓ) ≤ η, and y ∈ [0, 255]^N

SLIDE 54

IMAGE RESTORATION WITH MISSING SAMPLES

Figure: comparison between the epigraphical method (solid line) and the direct method (dashed line): ‖y^[n] − y^[∞]‖/‖y^[∞]‖ in dB vs. time, for ℓ2-TV, ℓ∞-TV, ℓ2-NLTV and ℓ∞-NLTV.

SLIDES 55-56

IMAGE RESTORATION WITH MISSING SAMPLES

Culicoidae image (degraded zoom), output SNR:
• first experiment: GPSR 17.03 dB; ℓ2-TV 20.80 dB; ℓ∞-TV 20.25 dB; ℓ2-NLTV 22.62 dB; ℓ∞-NLTV 22.38 dB;
• second experiment: GPSR 20.26 dB; ℓ2-TV 23.18 dB; ℓ∞-TV 22.77 dB; ℓ2-NLTV 24.18 dB; ℓ∞-NLTV 24.14 dB.

SLIDE 57

SEISMIC DATA ACQUISITION

Figure: principles of seismic wave propagation, with reflections on different layers, and data acquisition. Solid blue: primary; dashed red: multiple.

SLIDE 58

OBSERVATION MODEL

z(n) = s(n) + y(n), where
• n ∈ {0, ..., N−1}: the time index;
• z = (z(n))_{0≤n<N}: the observed data, combining
  1. the primary y = (y(n))_{0≤n<N} (signal of interest, unknown),
  2. the multiples (s(n))_{0≤n<N} (sum of undesired reflected signals).

We assume that a template (r(n))_{0≤n<N} for the disturbance signal is available and that

s(n) = Σ_{p=p'}^{p'+P−1} h^(n)(p) r(n−p).

The problem can be rewritten as z = R h + y.
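The operator R collecting the time-varying filters can be built explicitly: row n of R applies the filter h^(n) to shifted samples of the template r, so that (Rh)(n) = Σ_p h^(n)(p) r(n−p). The sizes N, P and the offset p' (written p0) below are illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(5)
N, P, p0 = 16, 3, 0                 # p0 stands in for p'
r = rng.standard_normal(N)          # template for the disturbance

# Row n of R holds r(n - p) at the positions of the stacked filter
# coefficients h^(n)(p), so that (R h)(n) = sum_p h^(n)(p) r(n - p).
R = np.zeros((N, N * P))
for n in range(N):
    for p in range(p0, p0 + P):
        if 0 <= n - p < N:
            R[n, n * P + (p - p0)] = r[n - p]

h = rng.standard_normal(N * P)      # stacked time-varying filters h^(n)
s = R @ h                           # multiples generated by the model
```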

SLIDE 59

MAP ESTIMATION - FILTERS h

Assumptions:
1. x = F y (where F ∈ R^{N×N} denotes the analysis operator) is a realization of a random vector whose probability density function (pdf) is given by (∀x ∈ R^N) f_X(x) ∝ exp(−ϕ(x));
2. h is a realization of a random vector whose pdf is expressed as (∀h ∈ R^{NP}) f_H(h) ∝ exp(−ρ(h)), and which is independent of x.

MAP estimation of h:

minimize over h ∈ R^{NP}: ϕ(F(z − R h)) + ρ(h).

• ϕ: data-fidelity term taking into account the statistical properties of the basis coefficients;
• ρ: prior information available on h.

SLIDES 60-61

CONVEX CONSTRAINTS ON THE FILTERS

Assumption: the filters vary slowly along the time index n:

(∀(n, p)) |h^(n+1)(p) − h^(n)(p)| ≤ ε_p.

The associated closed convex set is defined as

C = { h ∈ R^{NP} | (∀(n, p)) |h^(n+1)(p) − h^(n)(p)| ≤ ε_p }.

Minimization problem to be solved:

minimize over h ∈ R^{NP}: ϕ(F(z − R h)) + ρ(h) + ι_{C1}(h) + ι_{C2}(h).

PPXA+ is used to perform the minimization.

SLIDE 62

RESULTS: CONTEXT

• N = 2048; filter length: P = 14 (noise-free case), P = 10 (noisy case);
• PPXA+ parameters: λ_i ≡ 1.5, ω_1 = 10000/N, ω_2 = ω_1/P, ω_3 = ω_4 = 10 ω_2;
• iteration number: 10000 (stopping criterion at iteration i if ‖h^[i+1] − h^[i]‖ < 10^{−5});
• choice of functions: ϕ_k ≡ |·| and ρ = µ ‖·‖², with µ = 0.01;
• choice of basis: Symlet wavelets of length 8 over 3 resolution levels.

SLIDES 63-64

RESULTS: NOISE-FREE CASE

Figure: observed signal z, original signal y, model r, original multiple s, estimated signal ŷ and estimated multiples ŝ (time samples 850 to 1350).

NOISY CASE

Figure: reference signal y and estimated signal ŷ; multiples s and estimated multiples ŝ (time samples 1020 to 1200).

SLIDES 65-66

CONCLUSION

• Proximity operators and proximal methods prove to be very flexible tools for solving the variational problems encountered in inverse problems.
• The convex criterion can be composed of various terms modelling data fidelity (often linked to the noise statistics) as well as prior information, possibly formulated as convex (hard) constraints.
• Frames can be used to introduce prior information.
• Many other applications have been investigated (pMRI, compressive sensing, satellite imaging, stereovision, microscopy imaging, ...).

Future work:
• Use of these methods in statistical learning.
• Extension to the nonconvex case.

Thank you!

SLIDE 67

SOME REFERENCES

• L. Condat, "A primal-dual splitting method for convex optimization involving Lipschitzian, proximable and linear composite terms," J. Optimization Theory and Applications, to appear, 2013.
• G. Chierchia, N. Pustelnik, J.-C. Pesquet, B. Pesquet-Popescu, "Epigraphical projection and proximal tools for solving constrained convex optimization problems: Part I," arXiv:1210.5844, 2012.
• F. Bach, R. Jenatton, J. Mairal, and G. Obozinski, "Optimization with sparsity-inducing penalties," Foundations and Trends in Machine Learning, vol. 4, no. 1, pp. 1-106, 2012.
• P. L. Combettes and J.-C. Pesquet, "Primal-dual splitting algorithm for solving inclusions with mixtures of composite, Lipschitzian, and parallel-sum type monotone operators," Set-Valued and Variational Analysis, vol. 20, no. 2, pp. 307-330, Jun. 2012.
• J.-C. Pesquet and N. Pustelnik, "A parallel inertial proximal optimization method," Pacific Journal of Optimization, vol. 8, no. 2, pp. 273-305, Apr. 2012.
• N. Pustelnik, J.-C. Pesquet, C. Chaux, "Relaxing tight frame condition in parallel proximal methods for signal restoration," IEEE Trans. on Sig. Proc., vol. 60, no. 2, pp. 968-973, Feb. 2012.
• H. H. Bauschke and P. L. Combettes, Convex Analysis and Monotone Operator Theory in Hilbert Spaces, Springer-Verlag, New York, 2011.
• H. Raguet, J. Fadili, G. Peyré, "Generalized forward-backward splitting," preprint, arXiv:1108.4404, 2011.
• B. C. Vũ, "A splitting algorithm for dual monotone inclusions involving cocoercive operators," Advances in Computational Mathematics, Nov. 2011.
• L. M. Briceño-Arias and P. L. Combettes, "A monotone+skew splitting model for composite monotone inclusions in duality," SIAM Journal on Optimization, vol. 21, no. 4, pp. 1230-1250, Oct. 2011.
• P. L. Combettes and J.-C. Pesquet, "Proximal splitting methods in signal processing," in: Fixed-Point Algorithms for Inverse Problems in Science and Engineering (H. H. Bauschke, R. S. Burachik, P. L. Combettes, V. Elser, D. R. Luke, and H. Wolkowicz, eds.), pp. 185-212, Springer-Verlag, New York, 2011.
• A. Chambolle, T. Pock, "A first-order primal-dual algorithm for convex problems with applications to imaging," Journal of Mathematical Imaging and Vision, vol. 40, no. 1, pp. 120-145, May 2011.
• P. L. Combettes and J.-C. Pesquet, "A proximal decomposition method for solving convex variational inverse problems," Inverse Problems, vol. 24, no. 6, article ID 065014, 27 pp., Dec. 2008.
• P. L. Combettes and J.-C. Pesquet, "A Douglas-Rachford splitting approach to nonsmooth convex variational signal recovery," IEEE Journal of Selected Topics in Signal Processing, vol. 1, no. 4, pp. 564-574, Dec. 2007.
• C. Chaux, P. L. Combettes, J.-C. Pesquet, and V. R. Wajs, "A variational formulation for frame-based inverse problems," Inverse Problems, vol. 23, no. 4, pp. 1495-1518, Aug. 2007.

34 / 34