SLIDE 1

Decomposition by operator-splitting methods and applications in image deblurring

Daniel O’Connor (Department of Mathematics, UCLA) Lieven Vandenberghe (Electrical Engineering, UCLA) Workshop on Optimization for Modern Computation BICMR, Beijing, September 2–4, 2014

SLIDE 2

Primal-dual decomposition

minimize f(x) + g(Ax)

  • f, g are ‘simple’ convex functions (indicators of simple sets, norms, . . . )
  • A is a structured matrix
  • widely used format in literature on multiplier and splitting algorithms

This talk: decomposition by splitting the primal-dual optimality conditions

0 ∈ [0 Aᵀ; −A 0] [x; z] + [∂f(x); ∂g∗(z)]

and applications in image deblurring

SLIDE 3

Models of image blur

b = Kx + w

  • x is exact image, Kx is image blurred by blurring operator K
  • b is observed image, equal to blurred image plus noise w

Space-invariant blur: structure of K depends on boundary conditions

  • periodic: WKWᴴ is diagonal, where W is the 2D DFT matrix
  • zero/replicate: K = Kc + Ks with Kc diagonalizable by DFT, Ks sparse

Space-varying blur (Nagy-O’Leary model):

K = U1K1 + · · · + UmKm

  • Ki: space-invariant blurring operators
  • Ui: positive diagonal matrices that sum to the identity
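The periodic case above can be exercised in a few lines: with periodic boundary conditions, Kx is a circular convolution, so K is diagonalized by the 2D DFT and a Tikhonov-regularized deblur costs a handful of FFTs. A minimal numpy sketch (my own illustration, not the authors' code; the Gaussian kernel and the regularization weight lam are assumed choices):

```python
import numpy as np

def gaussian_kernel(n, sigma=2.0):
    # circularly centered, normalized Gaussian blur kernel on an n x n grid
    ax = np.minimum(np.arange(n), n - np.arange(n))  # circular distances
    g = np.exp(-ax.astype(float) ** 2 / (2 * sigma ** 2))
    k = np.outer(g, g)
    return k / k.sum()

def blur_periodic(x, k):
    # Kx as a circular convolution: K is diagonalized by the 2D DFT
    return np.real(np.fft.ifft2(np.fft.fft2(k) * np.fft.fft2(x)))

def deblur_tikhonov(b, k, lam=1e-3):
    # minimize ||Kx - b||^2 + lam*||x||^2, solved in the DFT domain
    Kf = np.fft.fft2(k)
    Xf = np.conj(Kf) * np.fft.fft2(b) / (np.abs(Kf) ** 2 + lam)
    return np.real(np.fft.ifft2(Xf))

rng = np.random.default_rng(0)
x = rng.random((64, 64))     # stand-in "exact image"
k = gaussian_kernel(64)
b = blur_periodic(x, k)      # noiseless blurred observation
xhat = deblur_tikhonov(b, k)
```

In practice lam would be tuned to the noise w; here it only stabilizes the division by small DFT coefficients.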

SLIDE 4

Deblurring via convex optimization

minimize φf(Kx − b) + φs(Dx) + φr(x)

Data fidelity term φf

  • convex penalty, e.g., squared 2-norm, 1-norm, Huber penalty, . . .
  • indicator for convex set, e.g., 2-norm ball

Smoothing term φs

  • D is discretized first derivative, or a wavelet/shearlet transform matrix
  • φs is a norm, e.g., for total variation reconstruction, a sum of 2-norms

φs(u, v) = γ ‖(u, v)‖iso = γ ∑ᵢ₌₁ⁿ √(uᵢ² + vᵢ²)

SLIDE 5

Deblurring via convex optimization

minimize φf(Kx − b) + φs(Dx) + φr(x)

Regularization term φr

  • penalty on x
  • indicator for convex set, for example, {x | 0 ≤ x ≤ 1}

In composite form: minimize f(x) + g(Ax) with

f(x) = φr(x),   A = [K; D],   g(u, v) = φf(u − b) + φs(v)

SLIDE 6

Outline

  • Introduction
  • Douglas-Rachford splitting method
  • Primal-dual splitting
  • Space-varying blur
SLIDE 7

Monotone operator

A set-valued mapping F is monotone if (u − v)ᵀ(x − y) ≥ 0 for all x, y ∈ dom F, u ∈ F(x), v ∈ F(y)

  • subdifferential ∂f of closed convex function f
  • skew-symmetric linear operator, for example,

F(x, z) = [0 Aᵀ; −A 0] [x; z]

  • sums of monotone operators, for example,

F(x, z) = [0 Aᵀ; −A 0] [x; z] + [∂f(x); ∂g∗(z)]

SLIDE 8

Resolvent

The resolvent of a monotone operator F is the operator (I + tF)⁻¹ (with t > 0)

Properties (for maximal monotone F)

  • y = (I + tF)⁻¹(x) exists and is unique for all x
  • y is the unique solution of the inclusion problem x ∈ y + tF(y)

Examples

  • the resolvent of the subdifferential ∂f is called the proximal operator of f
  • for linear monotone F, the resolvent (I + tF)⁻¹ is a matrix inverse

SLIDE 9

Proximal operator

The proximal operator of a closed convex function f is the mapping

prox_{tf}(x) = argmin_y ( f(y) + (1/2t) ‖y − x‖₂² )

Examples

  • f(x) = δC(x) (indicator of a closed convex set C): Euclidean projection

prox_{tf}(x) = PC(x) = argmin_{y∈C} ‖y − x‖₂²

  • f(x) = ‖x‖: shrinkage operation

prox_{tf}(x) = x − P_{tC}(x), where C is the unit ball for the dual norm
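The shrinkage example above can be checked numerically for the 1-norm, whose dual-norm unit ball is the ∞-norm ball (an illustrative sketch of my own, not code from the talk):

```python
import numpy as np

def proj_inf_ball(x, r):
    # Euclidean projection onto {y : ||y||_inf <= r}
    return np.clip(x, -r, r)

def prox_l1(x, t):
    # prox of t*||.||_1 via the identity prox_{tf}(x) = x - P_{tC}(x),
    # with C the unit ball of the dual (infinity) norm
    return x - proj_inf_ball(x, t)

x = np.array([3.0, -0.5, 1.2])
print(prox_l1(x, 1.0))   # soft-thresholding by 1
```

The result is componentwise soft-thresholding: entries with magnitude below t are set to zero, the rest shrink toward zero by t.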

SLIDE 10

Calculus rules for proximal operators

Separable function: if f(x1, x2) = f1(x1) + f2(x2), then

prox_f(x1, x2) = (prox_{f1}(x1), prox_{f2}(x2))

Moreau decomposition: relates prox-operators of conjugates

prox_{tf∗}(x) + t prox_{(1/t)f}(x/t) = x

Composition with an affine mapping: if f(x) = g(Ax + b) with AAᵀ = aI, then

prox_f(x) = (I − (1/a)AᵀA)x + (1/a)Aᵀ (prox_{ag}(Ax + b) − b)
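The Moreau decomposition can be verified numerically; here is a quick check (my own example) for f = ‖·‖₁, where the prox of the conjugate is projection onto the ∞-norm unit ball:

```python
import numpy as np

def prox_l1(x, t):
    # prox of t*||.||_1: soft-thresholding
    return np.sign(x) * np.maximum(np.abs(x) - t, 0.0)

def prox_conj(x, t):
    # prox of t*f* for f = ||.||_1: f* is the indicator of the
    # infinity-norm unit ball, so the prox is projection (t-independent)
    return np.clip(x, -1.0, 1.0)

rng = np.random.default_rng(1)
x = rng.normal(size=5)
t = 0.7
# prox_{tf*}(x) + t * prox_{(1/t)f}(x/t) = x
lhs = prox_conj(x, t) + t * prox_l1(x / t, 1.0 / t)
assert np.allclose(lhs, x)
```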
SLIDE 11

Douglas-Rachford splitting

Problem: given maximal monotone operators A, B, solve 0 ∈ A(x) + B(x)

Algorithm (Lions & Mercier, 1979)

x+ = (I + tA)⁻¹(z)
y+ = (I + tB)⁻¹(2x+ − z)
z+ = z + ρ(y+ − x+)

  • x converges under weak conditions (for any t > 0 and ρ ∈ (0, 2))
  • useful when resolvents of A, B are inexpensive, but not resolvent of sum
  • includes other well-known algorithms as special cases (e.g., ADMM)
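As an illustration (a toy problem of my own, not from the talk), the iteration above with A = ∂f, B = ∂g for f = ‖·‖₁ and g = (1/2)‖· − c‖₂²; the minimizer is soft-thresholding of c at level 1:

```python
import numpy as np

def prox_l1(z, t):
    # (I + t ∂f)^{-1} for f = ||.||_1: soft-thresholding
    return np.sign(z) * np.maximum(np.abs(z) - t, 0.0)

def prox_quad(z, t, c):
    # (I + t ∂g)^{-1} for g = 0.5*||x - c||^2
    return (z + t * c) / (1.0 + t)

c = np.array([3.0, -0.2, 1.5])
t, rho = 1.0, 1.0
z = np.zeros_like(c)
for _ in range(200):
    x = prox_l1(z, t)                # x+ = (I + tA)^{-1}(z)
    y = prox_quad(2 * x - z, t, c)   # y+ = (I + tB)^{-1}(2x+ - z)
    z = z + rho * (y - x)            # z+ = z + rho*(y+ - x+)

# x converges to the minimizer, soft-thresholding of c at level 1
assert np.allclose(x, prox_l1(c, 1.0), atol=1e-6)
```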

SLIDE 12

Outline

  • Introduction
  • Douglas-Rachford splitting method
  • Primal-dual splitting
  • Space-varying blur
SLIDE 13

Primal-dual splitting

Composite problem and dual

minimize f(x) + g(Ax)        maximize −f∗(−Aᵀz) − g∗(z)

Primal-dual optimality conditions

0 ∈ A(x, z) + B(x, z), where A(x, z) = [∂f(x); ∂g∗(z)], B(x, z) = [0 Aᵀ; −A 0] [x; z]

Resolvent computations

  • A: prox-operators of f and g
  • B: solution of a linear equation with coefficient I + t²AᵀA
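Computing the resolvent of the skew operator B reduces to the single linear system above by eliminating z; a small numpy sketch with random data (my own illustration):

```python
import numpy as np

def resolvent_skew(A, t, u, v):
    # solve (I + tB)(x, z) = (u, v) for B(x, z) = (A^T z, -A x):
    # eliminating z gives one system with coefficient I + t^2 A^T A
    n = A.shape[1]
    M = np.eye(n) + t ** 2 * A.T @ A
    x = np.linalg.solve(M, u - t * A.T @ v)
    z = v + t * A @ x
    return x, z

rng = np.random.default_rng(2)
A = rng.normal(size=(4, 3))
t = 0.5
u, v = rng.normal(size=3), rng.normal(size=4)
x, z = resolvent_skew(A, t, u, v)
# verify (x, z) + t*B(x, z) = (u, v)
assert np.allclose(x + t * A.T @ z, u)
assert np.allclose(z - t * A @ x, v)
```

For the deblurring problems in this talk the solve with I + t²AᵀA is done by FFTs rather than a dense factorization.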

SLIDE 14

Example: constrained L1-TV deblurring

minimize ‖Kx − b‖₁ + γ‖Dx‖iso subject to 0 ≤ x ≤ 1

  • Gaussian blur with salt-and-pepper noise; periodic boundary conditions
  • I + KᵀK + DᵀD diagonalizable by DFT
  • 1024 × 1024 image

figure: original, blurred, and restored images

SLIDE 15

Primal Douglas-Rachford splitting

Equivalent problem (δ is the indicator function of {0})

minimize f(x) + g(Ax)   →   minimize f(x) + g(y) + δ(Ax − y)

with F(x, y) = f(x) + g(y) and G(x, y) = δ(Ax − y)

Algorithm: Douglas-Rachford splitting applied to the optimality conditions 0 ∈ ∂F(x, y) + ∂G(x, y)

Resolvent computations

  • ∂F requires prox-operators of f, g
  • ∂G requires a linear equation with coefficient I + AᵀA

hence, similar complexity per iteration as primal-dual splitting

SLIDE 16

Alternating direction method of multipliers (ADMM)

Douglas-Rachford applied to the dual, after introducing a splitting variable u

minimize f(x) + g(Ax)   →   minimize f(u) + g(y) subject to [I; A] x − [u; y] = 0

ADMM: alternating minimization of the augmented Lagrangian

f(u) + g(y) + wᵀ(x − u) + zᵀ(Ax − y) + (t/2) (‖x − u‖₂² + ‖Ax − y‖₂²)

  • minimization over x: linear equation with coefficient I + AᵀA
  • minimization over (u, y): prox-operators of f, g

hence, similar complexity per iteration as primal-dual splitting

SLIDE 17

Chambolle-Pock method

0 ∈ [∂f(x); ∂g∗(z)] + [0 Aᵀ; −A 0] [x; z]

Algorithm

z+ = prox_{tg∗}(z + tAx̄)
x+ = prox_{sf}(x − sAᵀz+)
x̄+ = 2x+ − x

  • convergence requires √(st) < 1/‖A‖₂
  • no linear equations with A; only multiplications with A and Aᵀ
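A toy run of the algorithm above (my own example, not from the slides): f(x) = (1/2)‖x − c‖₂², g = ‖·‖₁, A = I, so prox_{tg∗} is projection onto the ∞-norm unit ball and the minimizer is soft-thresholding of c:

```python
import numpy as np

c = np.array([2.5, -0.3, 1.0])
A = np.eye(3)
s = t = 0.9                      # step sizes: sqrt(s*t) = 0.9 < 1/||A||_2 = 1
x = np.zeros(3); z = np.zeros(3); xbar = np.zeros(3)
for _ in range(500):
    z = np.clip(z + t * (A @ xbar), -1.0, 1.0)       # prox_{t g*}: projection
    x_new = (x - s * (A.T @ z) + s * c) / (1.0 + s)  # prox_{s f}, f = 0.5*||x-c||^2
    xbar = 2 * x_new - x                             # extrapolation step
    x = x_new

expected = np.sign(c) * np.maximum(np.abs(c) - 1.0, 0.0)
assert np.allclose(x, expected, atol=1e-5)
```

Note that each iteration only multiplies by A and Aᵀ; no linear system is solved.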

SLIDE 18

Convergence

figure: relative suboptimality (f(xk) − f⋆)/f⋆ versus iteration number k, for CP, ADMM, primal DR, and primal-dual DR

∼1.4 seconds per iteration for each method

SLIDE 19

Additive structure in A

minimize f(x) + g(Ax)        maximize −f∗(−Aᵀz) − g∗(z)

  • f, g have inexpensive prox-operators
  • A = B + C with structured B and C: equations with coefficients I + BᵀB and I + CᵀC are easy to solve, but not I + AᵀA

Extended primal-dual optimality conditions

0 ∈ [0; ∂g(y); 0; ∂f∗(w)] + [0 0 Aᵀ I; 0 0 −I 0; −A I 0 0; −I 0 0 0] [x; y; z; w]

SLIDE 20

Primal-dual splitting

0 ∈ A(x, y, z, w) + B(x, y, z, w), where

A(x, y, z, w) = [0; ∂g(y); 0; ∂f∗(w)] + [0 0 Bᵀ 0; 0 0 0 0; −B 0 0 0; 0 0 0 0] [x; y; z; w]

B(x, y, z, w) = [0 0 Cᵀ I; 0 0 −I 0; −C I 0 0; −I 0 0 0] [x; y; z; w]

Resolvent computations

  • A: prox-operators of f, g, and a linear equation with coefficient I + t²BᵀB
  • B: linear equation with coefficient I + (t²/(1 + t²)²) CᵀC

SLIDE 21

TV-L1 deblurring with replicate boundary conditions

minimize ‖(Kc + Ks)x − b‖₁ + γ‖(Dc + Ds)x‖iso subject to 0 ≤ x ≤ 1

  • Kc, Dc: operators for periodic boundary conditions
  • Ks, Ds: sparse correction for replicate boundary conditions

figure: blurry, noisy image; deblurred using periodic b.c.; deblurred using replicate b.c.

SLIDE 22

Handling replicate boundary conditions

K = Kc + Ks, D = Dc + Ds

  • Kc, Dc: operators assuming periodic boundary conditions
  • I + KcᵀKc + DcᵀDc is diagonalized by the DFT
  • E = I + KsᵀKs + DsᵀDs is sparse

figure: sparsity pattern of E (nz = 463680) and of the Cholesky factor of E (nz = 592892)

SLIDE 23

Primal Douglas-Rachford splitting

Equivalent problem: introduce splitting variables x̃, ỹ

minimize f(x) + g(y + ỹ) + δ(x − x̃) + δ(Bx − y) + δ(Cx̃ − ỹ)

with F(x, x̃, y, ỹ) = f(x) + g(y + ỹ) + δ(x − x̃) and G(x, x̃, y, ỹ) = δ(Bx − y) + δ(Cx̃ − ỹ),

and apply the Douglas-Rachford method to 0 ∈ ∂F(x, x̃, y, ỹ) + ∂G(x, x̃, y, ỹ)

Resolvent computations

  • ∂F: requires prox-operators of f, g
  • ∂G: linear equations with coefficients I + BᵀB, I + CᵀC

more variables but similar complexity per iteration as primal-dual splitting

SLIDE 24

ADMM

Equivalent problem: introduce another splitting variable u

minimize f(u) + g(y + ỹ)
subject to [I 0; 0 I; B 0; 0 C] [x; x̃] − [I 0 0; I 0 0; 0 I 0; 0 0 I] [u; y; ỹ] = 0

(that is, x = u, x̃ = u, Bx = y, Cx̃ = ỹ)

ADMM: alternating minimization of the augmented Lagrangian requires

  • linear equations with coefficients I + BᵀB, I + CᵀC
  • prox-operators of f, g

even more variables, but same complexity per iteration

SLIDE 25

Convergence

figure: relative suboptimality (f(xk) − f⋆)/f⋆ versus iteration number k, for CP, ADMM, primal DR, and primal-dual DR

∼0.1 seconds per iteration

SLIDE 26

Domain decomposition

  • divide image in rectangular regions
  • blurring and derivative operators are block-diagonal plus sparse
  • diagonal blocks are operations on regions, with periodic boundary conds.

Example: constrained TV-L1 deblurring on a 256 × 256 image

figure: blurry, noisy image; result after 5 iterations; restored image

SLIDE 27

Tight-frame regularization

minimize (1/2)‖Kx − b‖₂² + γ‖Dx‖₁

  • K = Kc + Ks for blurring with replicate boundary conditions
  • D is a shearlet tight frame: satisfies DᵀD = αI
  • 256 × 256 image

figure: noisy, blurred image; restored image

SLIDE 28

Outline

  • Introduction
  • Douglas-Rachford splitting method
  • Primal-dual splitting
  • Space-varying blur
SLIDE 29

Space-varying blurring

Blurring model (Nagy and O’Leary, 1998)

K = U1K1 + U2K2 + · · · + UmKm

  • Ki are blurring matrices for space-invariant kernels
  • Ui are positive diagonal matrices with U1 + · · · + Um = I
  • K is not diagonalizable by DFT or DCT

Convex deblurring problem

minimize φf(Kx − b) + φs(Dx) + φr(x)

  • we assume φf is separable (e.g., squared Euclidean norm, L1-norm)
  • D is tight frame or derivative operator
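Applying the model above costs m FFT-based convolutions plus diagonal scalings; a small numpy sketch (the two kernels and the weights are my own hypothetical choices, not from the talk):

```python
import numpy as np

def conv_periodic(x, kf):
    # space-invariant blur K_i x with precomputed kernel DFT kf
    return np.real(np.fft.ifft2(kf * np.fft.fft2(x)))

def apply_K(x, kernels_f, weights):
    # Nagy-O'Leary model: K x = sum_i U_i (K_i x), with U_i diagonal
    # (stored as pixelwise weight arrays that sum to one)
    return sum(U * conv_periodic(x, kf) for U, kf in zip(weights, kernels_f))

n = 32
# two hypothetical kernels: identity and a 3x3 box blur
k_id = np.zeros((n, n)); k_id[0, 0] = 1.0
k_box = np.zeros((n, n))
for i in (-1, 0, 1):
    for j in (-1, 0, 1):
        k_box[i % n, j % n] = 1.0 / 9.0
kernels_f = [np.fft.fft2(k_id), np.fft.fft2(k_box)]
# smooth weights that sum to one pixelwise (blend left-to-right)
U1 = np.linspace(0.0, 1.0, n)[None, :] * np.ones((n, 1))
weights = [U1, 1.0 - U1]

x = np.random.default_rng(3).random((n, n))
b = apply_K(x, kernels_f, weights)
```

Because the U_i sum to the identity, a constant image passes through K unchanged when every kernel sums to one.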

SLIDE 30

Splitting method

minimize φf(Kx − b) + φs(Dx) + φr(x)

As composite problem: minimize f(x) + g(Ax) with

f(x) = φr(x),   g(y1, . . . , ym+1) = φf(U1y1 + · · · + Umym − b) + φs(ym+1),   A = [K1ᵀ · · · Kmᵀ Dᵀ]ᵀ

  • I + AᵀA is diagonalizable by a DFT, hence easy to invert:

I + AᵀA = I + K1ᵀK1 + · · · + KmᵀKm + DᵀD

  • prox-operators of f and g reduce to prox-operators of φr, φf, φs

SLIDE 31

Example

  • 512 × 512 output image, 528 × 528 input (free boundary conditions)
  • m = 4 kernels (one for each quadrant of the image)

figure: noisy, blurred image; L2-TV deblurred image

∼0.2 seconds per iteration (cost of a small number of FFTs)

SLIDE 32

Convergence

figure: relative suboptimality (Fk − F⋆)/F⋆ versus iteration number k, for CP and DR

∼0.2 seconds per iteration (cost of a small number of FFTs)

SLIDE 33

Motion deblurring

Example and software from Chakrabarti, Zickler, Freeman (2010)

  • 367 × 600 image
  • algorithm estimates motion blur kernel and segments out blurred region

figure: image with motion blur; restored image

  • segmentation used to build Nagy-O’Leary model with two kernels
  • L2-TV deblurring using primal-dual Douglas-Rachford method

SLIDE 34

Summary

Douglas-Rachford splitting applied to the primal-dual optimality conditions of

minimize f(x) + g(Ax)

  • f and g have inexpensive prox-operators
  • A is structured: I + AᵀA is easy to invert
  • extension: A = B + C with I + BᵀB, I + CᵀC easy to invert
  • applications in image deblurring
  • extends primal-dual decomposition (f, g separable, A angular + sparse)