Projection methods: convergence and counterexamples, 4 January 2019




slide-1
SLIDE 1

Projection methods: convergence and counterexamples

4 January 2019 Hangzhou Dianzi University

Vera Roshchina School of Mathematics and Statistics UNSW Sydney v.roshchina@unsw.edu.au Based on joint work with Hong-Kun Xu, Roberto Cominetti and Andrew Williamson.

slide-2
SLIDE 2

The method of alternating projections

[Figure: two sets C1 and C2.]

slide-3
SLIDE 3

The method of alternating projections

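Let $H$ be a Hilbert space, with inner product $\langle \cdot, \cdot \rangle$ and norm $\|\cdot\|$. For any nonempty closed convex set $C \subseteq H$ and any $x \in H$ there exists a unique point $P_C(x) \in C$ such that

$$\|x - P_C(x)\| = \inf_{y \in C} \|x - y\|.$$

Given two closed convex sets $C_1, C_2 \subseteq H$ and $x_0 \in H$, let

$$x_1 = P_{C_1}(x_0), \quad x_2 = P_{C_2}(x_1), \quad x_3 = P_{C_1}(x_2), \quad x_4 = P_{C_2}(x_3), \quad \dots$$
$$x_{2k+1} = P_{C_1}(x_{2k}), \quad x_{2k+2} = P_{C_2}(x_{2k+1}), \quad \dots$$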
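The recursion above can be sketched in a few lines of Python. This is only an illustration under assumed data: the two sets (a unit disc and a half-plane in the plane), the starting point and the helper names `project_disc`, `project_halfplane` and `alternating_projections` are hypothetical choices, not taken from the talk.

```python
import math

def project_disc(x, centre=(0.0, 0.0), radius=1.0):
    """Nearest point of a closed disc to x (assumed example set C1)."""
    dx, dy = x[0] - centre[0], x[1] - centre[1]
    d = math.hypot(dx, dy)
    if d <= radius:
        return x
    return (centre[0] + radius * dx / d, centre[1] + radius * dy / d)

def project_halfplane(x, normal=(0.0, 1.0), offset=0.5):
    """Nearest point of {y : <normal, y> >= offset} to x (assumed example set C2).

    The normal vector is assumed to have unit length.
    """
    s = normal[0] * x[0] + normal[1] * x[1] - offset
    if s >= 0:
        return x
    return (x[0] - s * normal[0], x[1] - s * normal[1])

def alternating_projections(x0, p1, p2, steps):
    """Return [x0, x1, ...] with x_{2k+1} = p1(x_{2k}) and x_{2k+2} = p2(x_{2k+1})."""
    xs = [x0]
    for n in range(steps):
        project = p1 if n % 2 == 0 else p2
        xs.append(project(xs[-1]))
    return xs

iterates = alternating_projections((3.0, -2.0), project_disc, project_halfplane, 200)
limit = iterates[-1]
```

In this small example the two sets intersect, and the last iterate lies (up to rounding) in both of them.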

slide-4
SLIDE 4

Convergence

Let $M_1$ and $M_2$ be closed affine subspaces of $H$, $M = M_1 \cap M_2$.

Theorem 1 (von Neumann 1933). For each $x \in H$,

$$\lim_{n \to \infty} \|(P_{M_2} P_{M_1})^n(x) - P_M(x)\| = 0.$$

von Neumann, Functional Operators-Vol. II. The Geometry of Orthogonal Spaces, Annals of Math. Studies, 1950 (reprint of 1933 lectures).

Theorem 2 (Bregman 1965). For $C = C_1 \cap C_2 \neq \emptyset$, where $C_1, C_2 \subseteq H$ are closed convex sets, the sequence of alternating projections converges weakly to a point in $C$.

Bregman, The method of successive projection for finding a common point of convex sets, Sov. Math. Dokl., 1965.

The question of whether convergence is always strong remained open until 2004, despite many works on sufficient conditions.
slide-5
SLIDE 5

Counterexample of Hundal

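Theorem 3 (Hundal 2004). There exist a Hilbert space $H$, closed convex sets $C_1, C_2 \subset H$ with intersection $C_1 \cap C_2 = \{0\}$ and a starting point $x_0$ such that

$$\lim_{n \to \infty} \|(P_{C_2} P_{C_1})^n(x_0)\| > 0.$$

In a separable Hilbert space with an orthonormal basis $\{e_i\}_{i=1}^{\infty}$, let

$$C_1 = \{x \mid \langle x, e_1 \rangle \le 0\}, \qquad C_2 = \operatorname{cone}\{p(t) \mid t \ge 0\},$$
$$p(t) = e_{\lfloor t \rfloor + 2} \cos(f(t)) + e_{\lfloor t \rfloor + 3} \sin(f(t)) + e_1 h(t), \quad t \ge 0,$$
$$f(t) = \frac{\pi}{2}(t - \lfloor t \rfloor), \qquad h(t) = e^{-100 t^3}.$$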

Hundal, An alternating projection that does not converge in norm. Nonlinear Anal., 2004.
slide-6
SLIDE 6

Rate of convergence

[Figure: alternating projections from starting points labelled x0.]

slide-7
SLIDE 7

Angles between subspaces

The Friedrichs angle between two closed linear subspaces $M_1$ and $M_2$ is $\alpha \in [0, \pi/2]$ such that ($B_H$ is the unit ball, $M = M_1 \cap M_2$)

$$c = \cos \alpha = \sup \{ |\langle x, y \rangle| \;:\; x \in M_1 \cap M^{\perp} \cap B_H,\ y \in M_2 \cap M^{\perp} \cap B_H \}.$$

Theorem 4 (Aronszajn, 1950). For each $x \in H$ and $n \ge 1$,

$$\|(P_{M_2} P_{M_1})^n(x) - P_M(x)\| \le c^{2n-1} \|x\|.$$

We have $c < 1$ iff $M_1 + M_2$ is closed; in this case the method of alternating projections converges linearly.
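As a small numerical check of the bound in Theorem 4, the sketch below takes two lines through the origin in the plane, so that $M = \{0\}$ and $c$ is the cosine of the angle between the lines. The angle, the starting point and the helper `project_line` are assumptions made only for this illustration.

```python
import math

def project_line(x, u):
    """Orthogonal projection of x onto the line spanned by the unit vector u."""
    t = x[0] * u[0] + x[1] * u[1]
    return (t * u[0], t * u[1])

theta = math.pi / 6                      # assumed angle between the two lines
u1 = (1.0, 0.0)                          # M1 = span{u1}
u2 = (math.cos(theta), math.sin(theta))  # M2 = span{u2}
c = abs(u1[0] * u2[0] + u1[1] * u2[1])   # cosine of the Friedrichs angle

x0 = (2.0, 1.0)                          # assumed starting point
norm_x0 = math.hypot(x0[0], x0[1])

errors, bounds = [], []
x = x0
for n in range(1, 11):
    x = project_line(project_line(x, u1), u2)   # (P_{M2} P_{M1})^n (x0)
    errors.append(math.hypot(x[0], x[1]))       # P_M(x0) = 0 because M = {0}
    bounds.append(c ** (2 * n - 1) * norm_x0)   # right-hand side of Theorem 4
```

In this two-line setting each further double projection shrinks the error by the factor $c^2$, which is the linear rate mentioned in the theorem.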

Aronszajn, Theory of reproducing kernels, Trans. Amer. Math. Soc., 1950.

The constant is the smallest possible: Kayalar and Weinert, Error bounds for the method of alternating projections, Math. Control Signals Systems, 1988.

Generalisations to several sets: Reich and Zalas, The optimal error bound for the method of simultaneous projections, J. Approx. Theory, 2017.

slide-8
SLIDE 8

What if c = 1?

Theorem 5 (Bauschke, Borwein and Lewis). For two closed affine subspaces $M_1, M_2 \subseteq H$ exactly one of the alternatives holds.

(1) $M_1 + M_2$ is closed. Then for each $x$ the alternating projections converge linearly to $P_{M_1 \cap M_2}(x)$ with rate $c^2$.

(2) $M_1 + M_2$ is not closed. Then for any sequence of positive real numbers $1 > \lambda_1 \ge \lambda_2 \ge \dots \ge \lambda_n \to 0$ there exists a point $x_\lambda \in H$ such that

$$\|(P_{M_2} P_{M_1})^n(x_\lambda) - P_M(x_\lambda)\| \ge \lambda_n \quad \forall n \in \mathbb{N}.$$

Bauschke, Borwein, and Lewis, The method of cyclic projections for closed convex sets in Hilbert space, Contemporary Mathematics, 1997.

Bauschke, Deutsch, Hundal, Characterizing arbitrarily slow convergence in the method of alternating projections. Int. Trans. Oper. Res., 2009.

slide-9
SLIDE 9

Special properties and convergence

Regularity and the existence of Slater points

Gubin, Polyak, Raik, The method of projections for finding the common point of convex sets, USSR Comput. Math. Math. Phys., 1967.

Symmetry

Bruck, Reich, Nonexpansive projections and resolvents of accretive operators in Banach spaces, Houston J. Math., 1977.

Reich, A limit theorem for projections, Linear and Multilinear Algebra, 1983.

Semialgebraic structure

Borwein, Li, Yao, Analysis of the convergence rate for the cyclic projection algorithm applied to basic semialgebraic convex sets. SIAM J. Optim. 24, 498–527 (2014).

Drusvyatskiy, Li, Wolkowicz, A note on alternating projections for ill-posed semidefinite feasibility problems. Math. Program. 162 (2017), 537–548.

slide-10
SLIDE 10

What if the problem is infeasible?

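Assume that $C_1, C_2 \subseteq H$ are convex and closed, but possibly $C_1 \cap C_2 = \emptyset$. Define the distance between $C_1$ and $C_2$ as

$$\operatorname{dist}(C_1, C_2) = \inf_{x \in C_1,\ y \in C_2} \|y - x\|.$$

The following sets may be empty:

$$P_1 = \{x \in C_1 \mid \operatorname{dist}(x, C_2) = \operatorname{dist}(C_1, C_2)\}, \qquad P_2 = \{y \in C_2 \mid \operatorname{dist}(y, C_1) = \operatorname{dist}(C_1, C_2)\}.$$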

[Figure: two configurations of disjoint sets C1 and C2 with the vector v; the sets P1 and P2 are marked in the first.]

slide-11
SLIDE 11

The displacement vector and convergence

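Define the displacement vector $v = P_{\overline{C_2 - C_1}}(0)$, where $C_2 - C_1$ is the Minkowski difference, $C_2 - C_1 = \{y - x \mid x \in C_1,\ y \in C_2\}$. For the alternating projections we have

$$x_{2k} - x_{2k+1} \to v, \qquad x_{2k+2} - x_{2k+1} \to v.$$

If $P_1$ and $P_2$ are empty, then $\|x_n\| \to \infty$. Otherwise

$$x_{2k+1} \rightharpoonup \bar{x} \in P_1, \qquad x_{2k} \rightharpoonup \bar{y} \in P_2, \qquad \bar{y} - \bar{x} = v.$$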
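A minimal sketch of the infeasible case, assuming two disjoint unit discs in the plane with centres $(0,0)$ and $(4,0)$ (hypothetical data chosen so that $v = (2, 0)$, $P_1 = \{(1,0)\}$ and $P_2 = \{(3,0)\}$ can be read off by hand). The helper `project_disc` and the starting point are also assumptions.

```python
import math

def project_disc(x, centre, radius=1.0):
    """Nearest point of the closed disc with the given centre and radius."""
    dx, dy = x[0] - centre[0], x[1] - centre[1]
    d = math.hypot(dx, dy)
    if d <= radius:
        return x
    return (centre[0] + radius * dx / d, centre[1] + radius * dy / d)

c1, c2 = (0.0, 0.0), (4.0, 0.0)   # centres of C1 and C2 (assumed, disjoint discs)
x = (1.0, 5.0)                    # x0 (assumed)
for k in range(50):
    odd = project_disc(x, c1)     # x_{2k+1}, a point of C1
    even = project_disc(odd, c2)  # x_{2k+2}, a point of C2
    x = even

# Difference x_{2k+2} - x_{2k+1}, which should approximate the displacement vector v.
gap = (even[0] - odd[0], even[1] - odd[1])
```

Here the odd iterates approach the point of $C_1$ nearest to $C_2$, the even iterates approach the point of $C_2$ nearest to $C_1$, and their difference approaches $v$.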

Bauschke, Borwein, On the Convergence of von Neumann's Alternating Projection Algorithm for Two Sets, Set-Valued Analysis, 1993.

slide-12
SLIDE 12

A helpful illustration

[Figure: illustration with the sets C1 and C2.]

slide-13
SLIDE 13

What about more than two sets?

For $m \ge 2$ sets we can generalise alternating projections starting from $x_0 \in H$, and projecting cyclically onto each of the sets. For three sets $C_1, C_2, C_3$,

$$x_1 = P_{C_1}(x_0), \quad x_2 = P_{C_2}(x_1), \quad x_3 = P_{C_3}(x_2), \quad x_4 = P_{C_1}(x_3), \quad \dots$$

[Figure: cyclic projections onto C1, C2, C3 with iterates u0, u1, ..., u6.]

slide-14
SLIDE 14

There is no variational characterisation

Under mild assumptions (e.g. one of the sets is bounded) cyclic projections converge weakly either to a point in the intersection $C_1 \cap C_2 \cap \dots \cap C_m$ or to a fixed cycle if the intersection is empty.
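The convergence to a fixed cycle can be observed numerically. The sketch below uses three pairwise disjoint unit discs in the plane; the centres, the starting point and the helpers `project_disc` and `sweep` are illustrative assumptions.

```python
import math

def project_disc(x, centre, radius=1.0):
    """Nearest point of the closed disc with the given centre and radius."""
    dx, dy = x[0] - centre[0], x[1] - centre[1]
    d = math.hypot(dx, dy)
    if d <= radius:
        return x
    return (centre[0] + radius * dx / d, centre[1] + radius * dy / d)

centres = [(0.0, 0.0), (4.0, 0.0), (2.0, 4.0)]   # assumed centres of C1, C2, C3

def sweep(x):
    """Project onto C1, C2, C3 in turn and return the three points visited."""
    visited = []
    for centre in centres:
        x = project_disc(x, centre)
        visited.append(x)
    return visited

x = (5.0, 5.0)                   # assumed starting point
for _ in range(100):
    cycle = sweep(x)
    x = cycle[-1]
next_cycle = sweep(x)            # one more sweep, to compare with the previous one
```

After enough sweeps the three visited points stop changing, so `cycle` and `next_cycle` agree up to rounding; since the discs do not intersect, this is a cycle rather than a common point.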

Bruck, Reich, Nonexpansive projections and resolvents of accretive operators in Banach spaces. Houston J. Math., 1977.

Recall that for two sets this cycle realises the distance between the sets; however, for $m \ge 3$ there is no function $\Phi : H^m \to \mathbb{R}$ such that for any collection of compact convex sets $C_1, C_2, \dots, C_m \subset H$ the limit cycles are precisely the solutions to the minimisation problem

$$\min_{x_i \in C_i} \Phi(x_1, x_2, \dots, x_m).$$

Baillon, Combettes, Cominetti, There is no variational characterization of the cycles in the method of periodic projections. J. Funct. Anal., 2012.

slide-15
SLIDE 15

Under-relaxed projections

Fix $\alpha \in (0, 1]$ and instead of $P_C(x)$ consider

$$R(x) = (1 - \alpha)x + \alpha P_C(x).$$

[Figure: a set C and a point u, comparing the true projection with the under-relaxed projection.]

This leads to under-relaxed alternating and cyclic projections.
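A minimal sketch of the under-relaxed projection $R(x) = (1 - \alpha)x + \alpha P_C(x)$, assuming $C$ is the unit disc in the plane. The names `project_disc` and `relaxed` and the test point `u` are hypothetical.

```python
import math

def project_disc(x, radius=1.0):
    """Nearest point of the closed disc of the given radius centred at the origin."""
    d = math.hypot(x[0], x[1])
    if d <= radius:
        return x
    return (radius * x[0] / d, radius * x[1] / d)

def relaxed(x, alpha, project):
    """Under-relaxed projection R(x) = (1 - alpha) x + alpha P_C(x), alpha in (0, 1]."""
    p = project(x)
    return tuple((1 - alpha) * xi + alpha * pi for xi, pi in zip(x, p))

u = (3.0, 0.0)
full = relaxed(u, 1.0, project_disc)   # alpha = 1 recovers the true projection P_C(u)
half = relaxed(u, 0.5, project_disc)   # alpha = 0.5 stops halfway between u and P_C(u)
```

With these inputs `full` is `(1.0, 0.0)` and `half` is `(2.0, 0.0)`.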

slide-16
SLIDE 16

Under-relaxed projections

[Figure: under-relaxed cyclic projections onto C1, C2, C3. Iterations for α = 0.75 and α = 0.35 (shown in red).]

slide-17
SLIDE 17

Two special limits

Fix $\alpha \in (0, 1]$ and instead of $P_C(x)$ consider

$$R(x) = x + \alpha(P_C(x) - x) = (1 - \alpha)x + \alpha P_C(x).$$

The under-relaxed cyclic projections converge weakly to a fixed cycle iff such a cycle exists (e.g. when one of the sets is bounded).

Bruck, Reich, Nonexpansive projections and resolvents of accretive operators in Banach spaces. Houston J. Math., 1977.

Consider the limit of such $\alpha$-cycles as $\alpha \downarrow 0$, or alternatively vary $\alpha$, letting $\alpha_k \downarrow 0$, $\sum_{k \in \mathbb{N}} \alpha_k = +\infty$.

slide-18
SLIDE 18

De Pierro’s conjecture

Conjecture 1. The least squares solution

$$S = \operatorname*{Arg\,min}_{x \in H} \sum_{i=1}^{m} \min_{x_i \in C_i} \|x - x_i\|^2$$

exists iff both limits exist and solve this least squares problem.

De Pierro, From parallel to sequential projection methods and vice versa in convex feasibility: results and conjectures, Stud. Comput. Math., 2001.

The conjecture is true for affine subspaces of $\mathbb{R}^n$,

Censor, Eggermont, Gordon, Strong underrelaxation in Kaczmarz’s method for inconsistent systems. Numer. Math., 1983.

closed affine subspaces satisfying a metric regularity condition,

Bauschke, Edwards, A conjecture by De Pierro is true for translates of regular subspaces, J. Nonlinear Convex Anal., 2005.

and sets satisfying a certain geometric condition.

Baillon, Combettes, Cominetti, Asymptotic behavior of compositions of under-relaxed nonexpansive operators, J. Dyn. Games, 2014.

slide-19
SLIDE 19

A misleading example

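$$C_1 = \operatorname{co}\{(-2, 2, 1), (-2, 2, -1)\}, \qquad C_2 = \operatorname{co}\{(2, 2, 1), (2, 2, -1)\},$$
$$C_3 = \{(x, y, z) \mid x^2 + y^2 \le 1,\ |z| \le 1\}, \qquad S = \left\{ \left(0, \tfrac{5}{3}, z\right) : |z| \le 1 \right\}.$$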
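The value $5/3$ in $S$ can be checked by hand. The short derivation below is a sketch: it assumes that $S$ is the minimiser set of the sum of squared distances to $C_1$, $C_2$, $C_3$ (the least squares problem from De Pierro's conjecture) and writes $\varphi$ for that sum, a symbol introduced here only for illustration.

```latex
% For |z| <= 1 the nearest points in C_1, C_2, C_3 share the same z-coordinate,
% so only the (x, y)-coordinates contribute to the distances:
\varphi(x, y) = (x+2)^2 + (y-2)^2 + (x-2)^2 + (y-2)^2
              + \bigl(\max\{0, \sqrt{x^2 + y^2} - 1\}\bigr)^2 .
% The first four terms depend on x through 2x^2 + 8 and the last term is
% nondecreasing in |x|, so the minimum is attained at x = 0.
% For y >= 1 this leaves
\varphi(0, y) = 8 + 2(y-2)^2 + (y-1)^2, \qquad
\frac{d}{dy}\varphi(0, y) = 4(y-2) + 2(y-1) = 6y - 10 = 0
\;\Longrightarrow\; y = \tfrac{5}{3}.
```

Since $\varphi$ is convex and $5/3 \ge 1$, this stationary point is the minimiser in the $(x, y)$-plane, and every $z$ with $|z| \le 1$ gives the same value.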

[Figure: two panels showing S, C1, C2, C3 and the starting point u0, for z0 = 0.5 and z0 = −0.5.]

Under-relaxed projections for α = 0.5 and different starting points.

slide-20
SLIDE 20

Counterexample

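$$C_1 = \operatorname{co}\{(-2, 2, 1), (-2, 2, -1)\}, \qquad C_2 = \operatorname{co}\{(2, 2, 1), (2, 2, -1)\},$$
$$C_3 = \operatorname{co}\{p_k \mid k \in \mathbb{N}\}, \qquad p_k = (\cos t_k, \sin t_k, (-1)^k).$$

Here $\{t_k\}$ is increasing, $t_1 = \pi/4$ and $t_k \to \pi/2$ as $k \to \infty$.

[Figure: the sets C1, C2, C3 with the points p1, p2, p3, p4.]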

slide-21
SLIDE 21

Counterexample

For this three-set system the limits described earlier do not exist; however, the least-squares problem has a solution.

[Figure: the sets C1, C2, C3 with the points p1, p2, p3, p4.]

Cominetti, Roshchina, Williamson, A counterexample to De Pierro’s conjecture on the convergence of under-relaxed cyclic projections, Optimization, 2018.

slide-22
SLIDE 22

Reduction to two dimensions

The projections of the two-dimensional cycles correspond to an ‘oscillating’ path in 3D. As $\alpha \downarrow 0$, limit cycles ‘follow’ this path, and hence there is no convergence to a single point.

[Figure: the two-dimensional reduction with C1' = {a}, C2' = {b}, C3' and points v1, v2, v3, next to the three-dimensional sets C1, C2, C3 with p1, p2, p3, p4.]

slide-23
SLIDE 23

Bonus #1: Krasnoselskii-Mann iterations

When $T : C \to C$ is a contraction, i.e. for some $\rho \in [0, 1)$ we have

$$\|Tx - Ty\| \le \rho \|x - y\| \quad \forall x, y \in C,$$

for the fixed-point iterations we get the asymptotic regularity,

$$\|Tx_n - x_n\| \le \rho^n \|Tx_0 - x_0\| \to 0.$$

This is not the case for nonexpansive maps (with $\rho = 1$).

Let $T : C \to C$ be a nonexpansive map defined on a convex bounded subset $C$ of a normed space $X$. Krasnoselskii–Mann iterations (for $\alpha_n \in [0, 1]$):

$$x_{n+1} = (1 - \alpha_{n+1})x_n + \alpha_{n+1} T x_n.$$

slide-24
SLIDE 24

Bonus #1: Krasnoselskii-Mann iterations

For a rotation $T : \mathbb{R}^2 \to \mathbb{R}^2$: full step $x_{k+1} = T(x_k)$ on the left and averaged step $x_{k+1} = (1 - \alpha)x_k + \alpha T(x_k)$ on the right.
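The contrast between the two iterations can be reproduced with a short sketch. The rotation angle $\pi/2$, the value $\alpha = 0.5$, the starting point and the helper names `rotate` and `km_step` are assumptions made for this illustration.

```python
import math

def rotate(x, angle=math.pi / 2):
    """Rotation of the plane about the origin: a nonexpansive map with fixed point 0."""
    c, s = math.cos(angle), math.sin(angle)
    return (c * x[0] - s * x[1], s * x[0] + c * x[1])

def km_step(x, alpha, T):
    """One Krasnoselskii-Mann step (1 - alpha) x + alpha T(x)."""
    tx = T(x)
    return ((1 - alpha) * x[0] + alpha * tx[0], (1 - alpha) * x[1] + alpha * tx[1])

x_full = x_avg = (1.0, 0.0)
for _ in range(60):
    x_full = rotate(x_full)              # full step: stays on the unit circle
    x_avg = km_step(x_avg, 0.5, rotate)  # averaged step: spirals towards the fixed point

norm_full = math.hypot(x_full[0], x_full[1])
norm_avg = math.hypot(x_avg[0], x_avg[1])
```

The full-step iterates keep norm 1 and never approach the fixed point, while the averaged iterates shrink by the factor $\sqrt{1/2}$ per step for these particular parameters.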

slide-25
SLIDE 25

Rate of convergence

Krasnoselskii–Mann iterations: $x_{n+1} = (1 - \alpha_{n+1})x_n + \alpha_{n+1} T x_n$.

Theorem 6. The Krasnoselskii–Mann iterates satisfy

$$\|Tx_n - x_n\| \le \frac{\operatorname{diam} C}{\sqrt{\pi \sum_{i=1}^{n} \alpha_i (1 - \alpha_i)}}. \qquad (1)$$

Cominetti, Soto, Vaisman, On the rate of convergence of Krasnoselskii–Mann iterations and their connection with sums of Bernoullis. Israel J. Math., 2014.
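The bound (1) can be compared with the observed residuals in a simple case. The sketch assumes $C$ is the Euclidean unit disc (so $\operatorname{diam} C = 2$), $T$ is a rotation by $\pi/2$ and $\alpha_n \equiv 0.5$; these choices and the helper `rotate` are illustrative only.

```python
import math

def rotate(x, angle=math.pi / 2):
    """Rotation about the origin; it maps the unit disc to itself and is nonexpansive."""
    c, s = math.cos(angle), math.sin(angle)
    return (c * x[0] - s * x[1], s * x[0] + c * x[1])

alpha = 0.5    # constant relaxation parameter (assumed)
diam = 2.0     # diameter of the unit disc
x = (1.0, 0.0) # x_0 (assumed)

residuals, bounds = [], []
for n in range(1, 51):
    tx = rotate(x)
    x = ((1 - alpha) * x[0] + alpha * tx[0], (1 - alpha) * x[1] + alpha * tx[1])  # x_n
    tx = rotate(x)
    residuals.append(math.hypot(tx[0] - x[0], tx[1] - x[1]))           # ||T x_n - x_n||
    bounds.append(diam / math.sqrt(math.pi * n * alpha * (1 - alpha)))  # bound (1)
```

For this particular map the residuals decay geometrically, so they sit well below the bound, which only decays like $1/\sqrt{n}$.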

Theorem 7. The constant $\kappa = 1/\sqrt{\pi}$ in the bound (1) is tight. Specifically, for each $\kappa < 1/\sqrt{\pi}$ there exists a nonexpansive map $T$ defined on the unit cube $C = [0, 1]^{\mathbb{N}} \subseteq \ell^{\infty}(\mathbb{N})$, an initial point $x_0 \in C$, and a constant sequence $\alpha_n \equiv \alpha$, such that the corresponding KM iterates satisfy for some $n \in \mathbb{N}$

$$\|Tx_n - x_n\| > \kappa \, \frac{\operatorname{diam} C}{\sqrt{\sum_{i=1}^{n} \alpha_i (1 - \alpha_i)}}.$$

Bravo, Cominetti, Sharp convergence rates for averaged nonexpansive maps, Isr. J. Math., 2018.
slide-26
SLIDE 26

Bonus #2: Over-relaxed projections

Douglas–Rachford is a variant of the projection method that uses reflections and averages instead of projections. The Douglas–Rachford operator is defined as

$$T_{A,B} := \tfrac{1}{2}(I + R_B R_A), \qquad R_C := 2P_C - I.$$

For the convex setting, the convergence results are very similar to the method of alternating projections. However, the Douglas–Rachford method is successfully applied to nonsmooth problems, where its behaviour is not fully understood.

https://carma.newcastle.edu.au/scott/#!page-beauty-in-mathematics
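A minimal sketch of the operator $T_{A,B}$ in a convex setting, assuming $A$ is the unit disc and $B$ is the half-plane $\{(a, b) : a \ge 0.5\}$. The sets, the starting point and the helper names are hypothetical.

```python
import math

def project_disc(x, radius=1.0):
    """P_A for A the closed disc of the given radius centred at the origin."""
    d = math.hypot(x[0], x[1])
    if d <= radius:
        return x
    return (radius * x[0] / d, radius * x[1] / d)

def project_halfplane(x, bound=0.5):
    """P_B for B = {(a, b) : a >= bound}."""
    return (max(x[0], bound), x[1])

def reflect(x, project):
    """Reflection R_C(x) = 2 P_C(x) - x."""
    p = project(x)
    return (2 * p[0] - x[0], 2 * p[1] - x[1])

def douglas_rachford(x, project_a, project_b):
    """T_{A,B}(x) = (x + R_B(R_A(x))) / 2."""
    r = reflect(reflect(x, project_a), project_b)
    return ((x[0] + r[0]) / 2, (x[1] + r[1]) / 2)

x = (-3.0, 2.0)                       # assumed starting point
for _ in range(200):
    x = douglas_rachford(x, project_disc, project_halfplane)
shadow = project_disc(x)              # P_A of the final iterate
```

In this example the iteration settles quickly and the point `shadow` lies in both $A$ and $B$ up to rounding.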

Aragón Artacho, Borwein, Tam, Global behavior of the Douglas–Rachford method for a nonconvex feasibility problem. J. Global Optim., 2016.

Lindstrom, Sims, Survey: Sixty Years of Douglas–Rachford, 2018 (arxiv preprint).

slide-27
SLIDE 27

Bonus #3: Optimisation with projections

Consider an optimisation problem

$$\min f(x) \quad \text{s.t.} \quad x \in C_1 \cap C_2 \cap \dots \cap C_m,$$

where $f : \mathbb{R}^n \to \mathbb{R}$ is convex, and $C_1, \dots, C_m$ are closed convex sets in $\mathbb{R}^n$. A version of the cyclic projections algorithm can be supplemented with a gradient (subgradient) step. For example, consider sequential, parallel and cyclic projections (in this order), starting from some $x_0 \in \mathbb{R}^n$:

$$x_{k+1} := P_{C_m} \cdots P_{C_2} P_{C_1}(x_k - \lambda_k v_k),$$
$$x_{k+1} := \sum_{j=1}^{m} \beta_j P_{C_j}(x_k - \lambda_k v_k),$$
$$x_{k+1} := P_{C_{[k+1]}}(x_k - \lambda_k v_k), \qquad [k+1] = (k \bmod m) + 1,$$

where $v_k \in \partial f(x_k)$ (when $f$ is smooth, $v_k = \nabla f(x_k)$).
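The three update rules can be sketched on a toy problem. Everything specific here is an assumption for illustration: the objective $f(x) = \|x - a\|^2$ with $a = (3, 3)$, the sets $C_1 = \{x : x_1 \le 1\}$ and $C_2 = \{x : x_2 \le 1\}$, equal weights $\beta_j = 1/m$, the step sizes $\lambda_k = 1/(k+1)$ and the function names.

```python
def grad_f(x, a=(3.0, 3.0)):
    """Gradient of the illustrative objective f(x) = ||x - a||^2."""
    return (2 * (x[0] - a[0]), 2 * (x[1] - a[1]))

# Projections onto the assumed sets C1 = {x : x[0] <= 1} and C2 = {x : x[1] <= 1}.
projections = [
    lambda x: (min(x[0], 1.0), x[1]),
    lambda x: (x[0], min(x[1], 1.0)),
]

def run(mode, x, iterations=2000):
    """Gradient step followed by sequential, parallel or cyclic projections."""
    m = len(projections)
    for k in range(iterations):
        lam = 1.0 / (k + 1)                        # lambda_k -> 0 with divergent sum
        g = grad_f(x)
        y = (x[0] - lam * g[0], x[1] - lam * g[1])  # x_k - lambda_k v_k
        if mode == "sequential":                   # P_{Cm} ... P_{C1}(y)
            for project in projections:
                y = project(y)
            x = y
        elif mode == "parallel":                   # sum_j beta_j P_{Cj}(y), beta_j = 1/m
            pts = [project(y) for project in projections]
            x = (sum(p[0] for p in pts) / m, sum(p[1] for p in pts) / m)
        else:                                      # cyclic: P_{C[k+1]}(y), 0-based index k % m
            x = projections[k % m](y)
    return x

results = {mode: run(mode, (-2.0, 0.5)) for mode in ("sequential", "parallel", "cyclic")}
```

For this toy problem the constrained minimiser is $(1, 1)$, and all three variants end close to it; the parallel and cyclic variants approach it only as fast as $\lambda_k$ decreases.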

slide-28
SLIDE 28

Convergence

If the function $f$ is convex, $0 < \lambda_k \to 0$, $\sum_{k=1}^{\infty} \lambda_k = +\infty$, the feasible set is nonempty and $\{x_k\}$ is bounded, then for all three methods the sequence $\{f(x_k)\}$ of the function values converges to the optimal value, and every cluster point of $\{x_k\}$ is an optimal solution, given that the solution set is nonempty.

This result is also true for composite optimisation problems, when $f = f_1 + \dots + f_N$, and the subgradient step is replaced by a cycle of subgradient steps involving each one of these functions.

Convergence of sequential projections was shown in De Pierro, Neto, Salomão, From convex feasibility to convex constrained optimization using block action projection methods and underrelaxation. Int. Trans. Oper. Res. (2009).

Parallel and cyclic versions: Roshchina, Xu, forthcoming preprint, 2019.

slide-29
SLIDE 29

Thank you