Introductory Course on Non-smooth Optimisation, Lecture 05


SLIDE 1

Introductory Course on Non-smooth Optimisation

Lecture 05 - Peaceman–Rachford and Douglas–Rachford splitting

Jingwei Liang

Department of Applied Mathematics and Theoretical Physics

SLIDE 2

Table of contents

1. Problem
2. Peaceman–Rachford splitting
3. Douglas–Rachford splitting
4. Sum of more than two operators
5. Spingarn's method of partial inverses
6. Acceleration
7. Numerical experiments

SLIDE 3

Sum of two operators

Problem

Find x ∈ Rn such that 0 ∈ A(x) + B(x).

Assumptions
  • A, B : Rn ⇒ Rn are maximal monotone.
  • the resolvents of A and B are simple, i.e. easy to compute.
  • zer(A + B) ≠ ∅.

Jingwei Liang, DAMTP Introduction to Non-smooth Optimisation March 13, 2019

SLIDE 4

Outline

1. Problem
2. Peaceman–Rachford splitting
3. Douglas–Rachford splitting
4. Sum of more than two operators
5. Spingarn's method of partial inverses
6. Acceleration
7. Numerical experiments

SLIDE 5

Peaceman–Rachford splitting

Let z0 ∈ Rn, γ > 0:
  xk = JγB(zk),
  yk = JγA(2xk − zk),
  zk+1 = zk + 2(yk − xk).

  • dates back to the 1950s, for solving numerical PDEs.
  • the resolvents of A and B are evaluated separately.
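As a concrete sketch (not from the slides): take A = ∂(μ||·||1), whose resolvent is soft-thresholding, and B = Id − c, which is strongly (hence uniformly) monotone, so the PR iterates converge; for this pair the zero of A + B has the closed form soft(c, μ). The operators, step size γ and data c below are illustrative choices.

```python
import numpy as np

def soft(z, t):
    # component-wise soft-thresholding: the resolvent of t * ∂‖·‖₁
    return np.sign(z) * np.maximum(np.abs(z) - t, 0.0)

def peaceman_rachford(c, mu, gamma=1.0, iters=200):
    # PR splitting for 0 ∈ A(x) + B(x) with A = ∂(mu‖·‖₁) and B = Id − c
    JB = lambda z: (z + gamma * c) / (1.0 + gamma)  # resolvent of γB
    JA = lambda z: soft(z, gamma * mu)              # resolvent of γA
    z = np.zeros_like(c)
    for _ in range(iters):
        x = JB(z)
        y = JA(2 * x - z)
        z = z + 2 * (y - x)
    return JB(z)

c = np.array([3.0, -0.5, 1.2, -2.0])
x_pr = peaceman_rachford(c, mu=1.0)
x_true = soft(c, 1.0)   # closed-form zero of A + B for this toy pair
```

Note that both resolvents are evaluated separately, as on the slide; only prox/projection steps are needed.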


SLIDE 6

How to derive

given x⋆ ∈ zer(A + B), there exists z⋆ ∈ Rn such that
  z⋆ − x⋆ ∈ γA(x⋆),
  x⋆ − z⋆ ∈ γB(x⋆),
⇒
  z⋆ ∈ x⋆ + γA(x⋆),
  2x⋆ − z⋆ ∈ x⋆ + γB(x⋆).

apply the resolvent:
  x⋆ = JγA(z⋆),
  x⋆ = JγB(2x⋆ − z⋆).

equivalent formulation:
  x⋆ = JγA(z⋆),
  z⋆ = z⋆ + 2( JγB(2x⋆ − z⋆) − x⋆ ).

fixed-point iteration:
  xk = JγA(zk),
  zk+1 = zk + 2( JγB(2xk − zk) − xk ).


SLIDE 7

Fixed-point characterisation

Fixed-point formulation

Recall the reflection operator RγA = 2JγA − Id. Then
  yk = JγA(2xk − zk) = JγA ◦ (2JγB − Id)(zk).
For zk,
  zk+1 = zk + 2(yk − xk)
    = zk + 2( JγA ◦ (2JγB − Id)(zk) − JγB(zk) )
    = 2JγA ◦ (2JγB − Id)(zk) − (2JγB − Id)(zk)
    = (2JγA − Id) ◦ (2JγB − Id)(zk).

Property
  • RγA = 2JγA − Id and RγB = 2JγB − Id are non-expansive.
  • TPR = RγA ◦ RγB is non-expansive.

NB: convergence cannot be guaranteed in general.


SLIDE 8

Convergence

Uniform monotonicity: φ : R+ → [0, +∞] is increasing and vanishes only at 0, with
  ⟨x − y, u − v⟩ ≥ φ(||x − y||)  for all (x, u), (y, v) ∈ gra(B).

If B is uniformly monotone, then zer(A + B) = {x⋆} and fix(TPR) ≠ ∅. Moreover,
  ⟨x − y, JγB(x) − JγB(y)⟩ ≥ ||JγB(x) − JγB(y)||2 + γφ(||JγB(x) − JγB(y)||).

Let z⋆ ∈ fix(TPR), so that x⋆ = JγB(z⋆). Then
  ||zk+1 − z⋆||2 = ||RγA ◦ RγB(zk) − RγA ◦ RγB(z⋆)||2
    ≤ ||(2JγB − Id)(zk) − (2JγB − Id)(z⋆)||2
    = ||zk − z⋆||2 − 4⟨zk − z⋆, JγB(zk) − JγB(z⋆)⟩ + 4||JγB(zk) − JγB(z⋆)||2
    ≤ ||zk − z⋆||2 − 4γφ(||JγB(zk) − JγB(z⋆)||).

Hence φ(||JγB(zk) − JγB(z⋆)||) → 0, and ||xk − x⋆|| = ||JγB(zk) − JγB(z⋆)|| → 0.


SLIDE 9

Outline

1. Problem
2. Peaceman–Rachford splitting
3. Douglas–Rachford splitting
4. Sum of more than two operators
5. Spingarn's method of partial inverses
6. Acceleration
7. Numerical experiments

SLIDE 10

Douglas–Rachford splitting

To overcome the drawback of Peaceman–Rachford splitting.

Douglas–Rachford splitting

Let z0 ∈ Rn, γ > 0, λ ∈ ]0, 2[:
  xk = JγB(zk),
  yk = JγA(2xk − zk),
  zk+1 = zk + λ(yk − xk).


SLIDE 11

How to derive

given x⋆ ∈ zer(A + B), there exists z⋆ ∈ Rn such that
  z⋆ − x⋆ ∈ γA(x⋆),
  x⋆ − z⋆ ∈ γB(x⋆),
⇒
  z⋆ ∈ x⋆ + γA(x⋆),
  2x⋆ − z⋆ ∈ x⋆ + γB(x⋆).

apply the resolvent:
  x⋆ = JγA(z⋆),
  x⋆ = JγB(2x⋆ − z⋆).

equivalent formulation:
  x⋆ = JγA(z⋆),
  z⋆ = z⋆ + ( JγB(2x⋆ − z⋆) − x⋆ ).

fixed-point iteration:
  xk = JγA(zk),
  zk+1 = zk + ( JγB(2xk − zk) − xk ).


SLIDE 12

Fixed-point characterisation

Fixed-point formulation

Same as PR,
  yk = JγA ◦ RγB(zk),
and for zk,
  zk+1 = (1 − λ)zk + λ( zk + (yk − xk) )
    = (1 − λ)zk + λ( (1/2)zk + (1/2)(zk + 2(yk − xk)) )
    = (1 − λ)zk + λ · (1/2)(Id + RγA ◦ RγB)(zk).

Property
  • TDR = (1/2)(Id + RγA ◦ RγB) is firmly non-expansive.
  • TλDR = (1 − λ)Id + λTDR is λ/2-averaged non-expansive.
  • Peaceman–Rachford is the limiting case of Douglas–Rachford with λ = 2.

NB: convergence is guaranteed whenever λ(2 − λ) > 0, i.e. λ ∈ ]0, 2[.
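A small illustration of why the averaging matters (a hypothetical toy, not from the slides): for the normal cones of two lines through the origin in R2, the resolvents are projections, so TPR is a composition of two reflections, i.e. a rotation, and the PR iterates circle forever, while TDR with λ = 1 contracts to the intersection {0}.

```python
import numpy as np

def proj_line(d):
    # orthogonal projector onto span{d}
    d = d / np.linalg.norm(d)
    return np.outer(d, d)

theta = 0.3
PX = proj_line(np.array([1.0, 0.0]))
PY = proj_line(np.array([np.cos(theta), np.sin(theta)]))
RX, RY = 2 * PX - np.eye(2), 2 * PY - np.eye(2)   # reflections

T_pr = RX @ RY                     # PR operator: a rotation by 2*theta
T_dr = 0.5 * (np.eye(2) + T_pr)   # DR operator (lambda = 1)

z_pr = z_dr = np.array([1.0, 1.0])
for _ in range(500):
    z_pr, z_dr = T_pr @ z_pr, T_dr @ z_dr
# PR only rotates (the norm is preserved); DR contracts towards {0}
```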


SLIDE 13

Convergence rate


SLIDE 14

Convergence rate

Let X, Y be two subspaces
  X = {x : ax = 0},   Y = {x : bx = 0},
and assume 1 ≤ p := dim(X) ≤ q := dim(Y) ≤ n − 1.

Projection onto the subspace:
  PX(x) = x − aT(aaT)−1ax.

Define the diagonal matrices
  c = diag( cos(θ1), · · · , cos(θp) ),   s = diag( sin(θ1), · · · , sin(θp) ),
where θ1 ≤ · · · ≤ θp are the principal angles between X and Y.
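The projection formula PX(x) = x − aT(aaT)−1ax can be checked numerically; the matrix a below is random illustrative data, and the checks confirm PX is the orthogonal projector onto the null space of a.

```python
import numpy as np

rng = np.random.default_rng(0)
a = rng.standard_normal((2, 5))   # X = {x : ax = 0}, so dim(X) = 5 - 2 = 3

# P_X = Id - a^T (a a^T)^{-1} a, the matrix form of the formula above
P_X = np.eye(5) - a.T @ np.linalg.solve(a @ a.T, a)

x = rng.standard_normal(5)
px = P_X @ x   # lies in X: a @ px vanishes
```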


SLIDE 15

Convergence rate

Suppose p + q < n. Then there exists an orthogonal matrix U such that
  PX = U diag( Idp, 0p, 0q−p, 0n−p−q ) U∗
and
  PY = U diag( [c2, cs; cs, s2], Idq−p, 0n−p−q ) U∗,
where [c2, cs; cs, s2] denotes the 2×2 block matrix with p×p blocks c2, cs, cs, s2.


SLIDE 16

Convergence rate

For the compositions,
  PX ◦ PY = U diag( [c2, cs; 0p, 0p], 0q−p, 0n−p−q ) U∗
and
  PX⊥ ◦ PY⊥ = U diag( [0p, 0p; −cs, c2], 0q−p, Idn−p−q ) U∗.


SLIDE 17

Convergence rate

Fixed-point operator
  TDR = PX ◦ PY + PX⊥ ◦ PY⊥ = U diag( [c2, cs; −cs, c2], 0q−p, Idn−p−q ) U∗.

Consider the relaxation TλDR = (1 − λ)Id + λTDR:
  TλDR = U diag( [Idp − λs2, λcs; −λcs, Idp − λs2], (1 − λ)Idq−p, Idn−p−q ) U∗.


SLIDE 18

Convergence rate

Eigenvalues
  σ(TλDR) = { 1 − λsin2(θi) ± iλcos(θi)sin(θi) : i = 1, ..., p } ∪ {1}            if q = p,
  σ(TλDR) = { 1 − λsin2(θi) ± iλcos(θi)sin(θi) : i = 1, ..., p } ∪ {1} ∪ {1 − λ}  if q > p.

Complex eigenvalues:
  |1 − λsin2(θi) ± iλcos(θi)sin(θi)| = sqrt( λ(2 − λ)cos2(θi) + (1 − λ)2 ),
with
  1 ≥ sqrt( λ(2 − λ)cos2(θi) + (1 − λ)2 ) ≥ |1 − λ|.

limk→+∞ TkDR = T∞DR, and zk − z⋆ = (TDR − T∞DR)(zk−1 − z⋆).

Spectral radius, minimised at λ = 1:
  ρ(TDR − T∞DR) = sqrt( λ(2 − λ)cos2(θ1) + (1 − λ)2 ),
where θ1 is the smallest (Friedrichs) angle. Writing T̃DR = TDR − T∞DR,
  ||zk − z⋆|| = ||T̃DR(zk−1 − z⋆)|| = · · · = ||T̃DRk(z0 − z⋆)|| ≤ C ρ(T̃DR)k ||z0 − z⋆||.
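A quick numerical check of the modulus identity, and of the fact that the rate is minimised at λ = 1 (the grid and angles below are arbitrary choices):

```python
import numpy as np

theta = np.linspace(0.05, np.pi / 2 - 0.05, 7)
for lam in (0.5, 1.0, 1.5, 1.9):
    # complex eigenvalues 1 - lam*sin^2(theta) +/- i*lam*cos(theta)*sin(theta)
    ev = 1 - lam * np.sin(theta) ** 2 + 1j * lam * np.cos(theta) * np.sin(theta)
    rhs = np.sqrt(lam * (2 - lam) * np.cos(theta) ** 2 + (1 - lam) ** 2)
    assert np.allclose(np.abs(ev), rhs)   # the modulus identity

# at a fixed angle the modulus, hence the linear rate, is smallest at lam = 1
lams = np.linspace(0.1, 1.9, 181)
mod = np.sqrt(lams * (2 - lams) * np.cos(0.7) ** 2 + (1 - lams) ** 2)
best = lams[np.argmin(mod)]
```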


SLIDE 19

Optimal metric for DR

From X and Y to X′ and Y.

Optimal metric

An invertible transformation which makes the Friedrichs angle between X′ and Y the largest, e.g. π/2 ...

SLIDE 20

Outline

1. Problem
2. Peaceman–Rachford splitting
3. Douglas–Rachford splitting
4. Sum of more than two operators
5. Spingarn's method of partial inverses
6. Acceleration
7. Numerical experiments

SLIDE 21

More than two operators

Problem (s ∈ N+ and s ≥ 2)

Find x ∈ Rn such that 0 ∈ Σi Ai(x).

Assumptions
  • for each i = 1, ..., s, Ai : Rn ⇒ Rn is maximal monotone.
  • zer(Σi Ai) ≠ ∅.

SLIDE 22

Product space

Let H = Rn × · · · × Rn (s times), endowed with the inner product and norm
  ⟨x, y⟩ = Σi=1..s ⟨xi, yi⟩,   |||x|||2 = Σi=1..s ||xi||2.

Let S = { x = (xi)i ∈ H : x1 = · · · = xs } and its orthogonal complement
  S⊥ = { x = (xi)i ∈ H : Σi=1..s xi = 0 }.

SLIDE 23

Equivalent formulation

Define A by A : x ∈ H → A1(x1) × · · · × As(xs).

Lifted problem

Find x ∈ H such that 0 ∈ A(x) + NS(x).

  • the resolvent of A is separable, i.e. JγA = (JγAi)i.
  • define the canonical isometry C : Rn → S, x → (x, · · · , x); then PS(z) = C( (1/s) Σi=1..s zi ).
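A minimal consensus sketch of the lifted problem (assumed data, not from the slides: Ai = ∂fi with fi(x) = (1/2)||x − ci||2, so each JγAi is affine and the zero of Σi Ai is the mean of the ci). PS plays the role of the resolvent of NS, and the DR update runs in the product space:

```python
import numpy as np

c = np.array([[1.0, 0.0], [3.0, 2.0], [5.0, 4.0]])   # one data block per operator
gamma = 1.0

def J_A(z):
    # block-separable resolvent: J_{gamma A}(z)_i = prox of gamma*f_i = (z_i + gamma*c_i)/(1 + gamma)
    return (z + gamma * c) / (1.0 + gamma)

def P_S(z):
    # resolvent of N_S: project onto consensus by replicating the block average
    return np.tile(z.mean(axis=0), (z.shape[0], 1))

z = np.zeros_like(c)
for _ in range(100):
    x = P_S(z)
    y = J_A(2 * x - z)
    z = z + (y - x)           # DR update with lambda = 1

x_dr = P_S(z)[0]              # consensus solution: the mean of the c_i
```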

SLIDE 24

Outline

1. Problem
2. Peaceman–Rachford splitting
3. Douglas–Rachford splitting
4. Sum of more than two operators
5. Spingarn's method of partial inverses
6. Acceleration
7. Numerical experiments

SLIDE 25

Problem

DR in product space

For a solution x⋆ ∈ S of the lifted problem, there exists v such that −v ∈ S⊥ = NS(x⋆) and v ∈ A(x⋆).

Problem (V is a closed subspace)

Find x ∈ V and v ∈ V⊥ such that v ∈ A(x).

Assumptions
  • A : Rn ⇒ Rn is maximal monotone.
  • the problem admits at least one solution.


SLIDE 26

Partial inverse

Let A : Rn ⇒ Rn be set-valued and V ⊆ Rn a closed subspace. The partial inverse of A with respect to V is the operator AV : Rn ⇒ Rn defined by
  gra(AV) = { ( PV(x) + PV⊥(u), PV⊥(x) + PV(u) ) : (x, u) ∈ gra(A) }.

Example

Let A : Rn ⇒ Rn, then ARn = A and A{0} = A−1.


SLIDE 27

Spingarn’s method of partial inverses

An application of the proximal point algorithm.

Spingarn

Let x0 ∈ V, u0 ∈ V⊥:
  yk = JA(xk + uk),
  vk = xk + uk − yk,
  (xk+1, uk+1) = ( PV(yk), PV⊥(vk) ).
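A sketch of the iteration on an assumed toy instance (not from the slides): A = Id − c, so JA(z) = (z + c)/2, and V a line; for this A the solution is x⋆ = PV(c) with v⋆ = x⋆ − c = −PV⊥(c) ∈ V⊥.

```python
import numpy as np

c = np.array([2.0, 1.0, -1.0])
d = np.array([1.0, 1.0, 0.0]); d /= np.linalg.norm(d)
P_V = np.outer(d, d)              # projector onto V = span{d}
P_Vp = np.eye(3) - P_V            # projector onto the complement V_perp

J_A = lambda z: (z + c) / 2.0     # resolvent of A = Id - c

x, u = np.zeros(3), np.zeros(3)   # x0 in V, u0 in V_perp
for _ in range(200):
    y = J_A(x + u)
    v = x + u - y
    x, u = P_V @ y, P_Vp @ v

# at the limit: x = P_V(c) and u = -P_Vp(c)
```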


SLIDE 28

Fixed-point characterisation

Define the mapping L : Rn ⊕ Rn → Rn ⊕ Rn : (x, u) → ( PV(x) + PV⊥(u), PV⊥(x) + PV(u) ). Then
  p = JAV(x) ⇔ (p, x − p) ∈ gra(AV)
    ⇔ L(p, x − p) ∈ L( gra(AV) ) = gra(A)
    ⇔ ( PV(p) + PV⊥(x − p), PV⊥(p) + PV(x − p) ) ∈ gra(A).

Let q = PV(p) + PV⊥(x − p), then
  p = JAV(x) ⇔ x − q = PV(x − p) + PV⊥(p) ∈ A(q) ⇔ q = JA(x).

Let zk = xk + uk. Since xk ∈ V and uk ∈ V⊥,
  PV(zk+1) + PV⊥(zk − zk+1) = xk+1 + uk − uk+1
    = PV(yk) + PV⊥(vk − xk + yk) − PV⊥(vk)
    = PV(yk) + PV⊥(vk) + PV⊥(yk) − PV⊥(vk)
    = yk = JA(zk),
hence zk+1 = JAV(zk): Spingarn's method is the proximal point algorithm applied to the partial inverse AV.


SLIDE 29

Outline

1. Problem
2. Peaceman–Rachford splitting
3. Douglas–Rachford splitting
4. Sum of more than two operators
5. Spingarn's method of partial inverses
6. Acceleration
7. Numerical experiments

SLIDE 30

Inertial DR splitting

An inertial DR splitting

Initialise: z0 ∈ Rn, z−1 = z−2 = · · · = z0 and γ > 0;
  yk = zk + a0,k(zk − zk−1) + a1,k(zk−1 − zk−2) + · · · ,
  zk+1 = TDR(yk).

Relaxation can be applied as well.
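A generic sketch of the one-step inertial scheme (constant a0,k ≡ a, higher-order terms dropped), applied to an assumed toy firmly non-expansive map rather than a full DR operator:

```python
import numpy as np

def inertial_iteration(T, z0, a=0.3, iters=300):
    # y_k = z_k + a * (z_k - z_{k-1});  z_{k+1} = T(y_k)
    z_prev, z = z0.copy(), z0.copy()
    for _ in range(iters):
        y = z + a * (z - z_prev)
        z_prev, z = z, T(y)
    return z

# toy firmly non-expansive map with fix(T) = {c}
c = np.array([1.0, -2.0])
T = lambda z: 0.5 * (z + c)

z_plain = inertial_iteration(T, np.zeros(2), a=0.0)   # no inertia
z_inert = inertial_iteration(T, np.zeros(2), a=0.3)   # one-step inertia
```

Both runs converge to the fixed point; the inertial term only changes the trajectory and, for suitable a, the speed.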


SLIDE 31

Outline

1. Problem
2. Peaceman–Rachford splitting
3. Douglas–Rachford splitting
4. Sum of more than two operators
5. Spingarn's method of partial inverses
6. Acceleration
7. Numerical experiments

SLIDE 32

Example: basis pursuit

Basis pursuit

  min_{x ∈ Rn} ||x||1  such that  Ax = b,

  • A : Rn → Rm with m ≪ n.
  • b ∈ Img(A).
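A sketch of DR applied to a tiny basis-pursuit instance (the data A, b and the unique solution (0, 1) are illustrative assumptions): one resolvent is the projection onto the affine set {x : Ax = b}, the other is soft-thresholding.

```python
import numpy as np

# hypothetical tiny instance: min |x1| + |x2| s.t. x1 + 2*x2 = 2, solution (0, 1)
A = np.array([[1.0, 2.0]])
b = np.array([2.0])
AAt_inv = np.linalg.inv(A @ A.T)

def P_C(z):
    # projection onto the affine set C = {x : Ax = b}
    return z - A.T @ (AAt_inv @ (A @ z - b))

def soft(z, t):
    # prox of t * ‖·‖₁
    return np.sign(z) * np.maximum(np.abs(z) - t, 0.0)

gamma = 1.0
z = np.zeros(2)
for _ in range(500):
    x = P_C(z)                    # resolvent of the normal cone N_C
    y = soft(2 * x - z, gamma)    # resolvent of gamma * d‖·‖₁
    z = z + (y - x)               # DR update, lambda = 1

x_dr = P_C(z)
```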


SLIDE 33

Example: image inpainting

Image inpainting

  min_{X ∈ Rn×n} ||WX||1  such that  PΩ(X) = X̄,

  • W: total variation, orthonormal basis, or redundant wavelet frame.
  • Observation constraint:
      (PΩ(X))i,j = X̄i,j if (i, j) ∈ Ω, and 0 otherwise.
  • Application: painting reconstruction in museums.


SLIDE 34

Example: matrix completion

Matrix completion

  min_{X ∈ Rn×n} ||X||∗  such that  PΩ(X) = X̄,

  • Observation constraint:
      (PΩ(X))i,j = X̄i,j if (i, j) ∈ Ω, and 0 otherwise.
  • Applications: Netflix prize, recommendation systems.


SLIDE 35

Example: variational inequality

Variational inequality

Find x ∈ Rn such that ∃u ∈ A(x), ∀y ∈ Rn: ⟨x − y, u⟩ + R(x) ≤ R(y).
  • R ∈ Γ0.
  • A : Rn ⇒ Rn is maximal monotone.

Example

Let R, J ∈ Γ0 and x⋆ ∈ Argmin(R + J). Then there exists u ∈ ∂J(x⋆) such that −u ∈ ∂R(x⋆), and
  ⟨y − x⋆, −u⟩ + R(x⋆) ≤ R(y) ⇔ ⟨x⋆ − y, u⟩ + R(x⋆) ≤ R(y).


SLIDE 36

Numerical experiment

[Figure: convergence comparison of Douglas–Rachford, 1-step inertial DR and 2-step inertial DR (error vs. iteration, semi-log scale); panels: comparison and trajectory.]


SLIDE 37

Reference

  • H. H. Bauschke, J. Y. Bello Cruz, T. T. A. Nghia, H. M. Phan, and X. Wang. "Optimal rates of linear convergence of relaxed alternating projections and generalized Douglas–Rachford methods for two subspaces". Numerical Algorithms, 73(1):33–76, 2016.
  • H. H. Bauschke and P. L. Combettes. "Convex Analysis and Monotone Operator Theory in Hilbert Spaces". Springer, 2011.
  • J. Liang. "Convergence Rates of First-Order Operator Splitting Methods". PhD thesis, Normandie Université; GREYC CNRS UMR 6072, 2016.