Phase synchronization: An example of global optimality on manifolds




SLIDE 1

Phase synchronization

An example of global optimality on manifolds

Nicolas Boumal, Inria & ENS Paris
Joint work with Afonso Bandeira and Amit Singer

One-day workshop on Riemannian and nonsmooth optimization
Sept. 25, 2015
SLIDE 2

Note for the reader

A tighter version of the theorems in these slides will appear in an update to the arXiv paper 1411.3272 (version 1, from Nov. 2014, is less tight).

The version in these slides was chosen because it is relatively simple to derive in the allotted time. In particular, the bound on $\|z - x\|_2$ can be reduced to $O(\sigma)$, and the tightness rate can be improved to $\sigma \le c \cdot n^{1/4}$ as a result.

Nicolas, September 28, 2015.

SLIDE 3

The goal: estimate individual orientations, from pairwise comparisons (up to global shift)

SLIDE 4

The goal: estimate individual orientations, from pairwise comparisons (up to global shift)

SLIDE 5

Estimate phases from relative info

Unknowns

$z_1, \ldots, z_n \in \mathbb{C}$, with $|z_1| = \cdots = |z_n| = 1$, that is, $z \in \mathbb{C}_1^n$

Data

Noisy measurements of relative phases: $C_{ij} = z_i z_j^* + \sigma W_{ij}$

SLIDE 6

What does additive Gaussian noise mean here?

Noise affects the phase of the measurement

$C_{ij} = z_i z_j^* + \sigma W_{ij}$

SLIDE 7

Maximum likelihood estimation

$z \in \mathbb{C}_1^n$, $C = zz^* + \sigma W$

Under Gaussian noise, the MLE solves a least-squares problem:

$\min_{x \in \mathbb{C}_1^n} \|C - xx^*\|_F^2 \;\equiv\; \max_{x \in \mathbb{C}_1^n} x^* C x$

SLIDE 8

[Figure: level sets of $x^* C x$, showing $x$ (the MLE), $z$ (the signal), and the $\ell_2$ ball of radius $\sqrt{12\sigma\sqrt{n}}$.]

SLIDE 9

Lemma: The MLE $x$ is close to the signal $z$

The MLE is more likely than the signal: $z^* C z \le x^* C x$.

Hence, with $C = zz^* + \sigma W$ and $\|W\|_{\mathrm{op}} \le 3\sqrt{n}$,

$n^2 + \sigma z^* W z \le |z^* x|^2 + \sigma x^* W x$

$n^2 - |z^* x|^2 \le \sigma \left( x^* W x - z^* W z \right) \le \sigma \cdot 2n \|W\|_{\mathrm{op}} \le 6 \sigma n^{3/2}$

Divide by $n + |z^* x| \ge n$. Implies:

$\min_\theta \|z - x e^{i\theta}\|_2^2 = 2(n - |z^* x|) \le 12 \sigma \sqrt{n}$
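Two ingredients of this lemma can be checked numerically: the closed form $\min_\theta \|z - x e^{i\theta}\|_2^2 = 2(n - |z^* x|)$, and the operator-norm bound $\|W\|_{\mathrm{op}} \le 3\sqrt{n}$ on one random draw. The sketch below is only illustrative: `x` is an arbitrary unit-modulus vector (not the MLE), and the noise model (Hermitian complex Gaussian, zero diagonal) is an assumption.

```python
import numpy as np

rng = np.random.default_rng(1)
n = 200
z = np.exp(1j * rng.uniform(0.0, 2.0 * np.pi, size=n))
x = np.exp(1j * rng.uniform(0.0, 2.0 * np.pi, size=n))  # arbitrary point, not the MLE

# Optimal global phase: rotate x so that z* (x e^{i theta}) is real and positive.
inner = np.vdot(z, x)                 # np.vdot conjugates its first argument: z* x
theta = -np.angle(inner)
dist2 = np.linalg.norm(z - x * np.exp(1j * theta)) ** 2
closed_form = 2.0 * (n - abs(inner))

# One draw of the assumed noise matrix and its operator (spectral) norm.
G = (rng.standard_normal((n, n)) + 1j * rng.standard_normal((n, n))) / np.sqrt(2.0)
W = np.triu(G, k=1)
W = W + W.conj().T
op_norm = np.linalg.norm(W, 2)
bound = 3.0 * np.sqrt(n)

print(dist2, closed_form)
print(op_norm, bound)
```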

SLIDE 10

The MLE is NP-hard to compute

The problem

$\max_{x \in \mathbb{C}_1^n} x^* C x$

has a quadratic cost $x^* C x$ and nonconvex quadratic constraints $|x_i|^2 = 1$.

SLIDE 11

Surprisingly, global optimality can be tested (sometimes)

Given $x \in \mathbb{C}_1^n$, consider

$S = S(x) = \Re\{\mathrm{ddiag}(C x x^*)\} - C$

If $S$ is positive semidefinite, then $x$ is optimal:

$\forall y \in \mathbb{C}_1^n, \quad 0 \le y^* S y = x^* C x - y^* C y$

SLIDE 12

Optimization on manifolds finds a certified optimum quite often

[Figure: phase diagram with axes "Noise level $\sigma$" and "Number of phases $n$". Legend: "Certified optimum: $S \succeq 0$" and "Don't know".]
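One cell of such an experiment could look roughly like the sketch below. The slides do not say which solver, initialization, or noise model was used, so everything here is an assumption: a plain Riemannian gradient ascent with a heuristic step size stands in for the actual optimization-on-manifolds solver, the starting point is the leading eigenvector of $C$, and the noise is Hermitian complex Gaussian with zero diagonal.

```python
import numpy as np

def project_tangent(x, g):
    """Project g onto the tangent space at x of the product of unit circles."""
    return g - np.real(g * np.conj(x)) * x

def riemannian_gradient_ascent(C, x0, iters=500):
    """Increase x* C x over unit-modulus x with projected gradient steps."""
    x = x0 / np.abs(x0)
    step = 1.0 / np.linalg.norm(C, 2)     # heuristic step size (assumption)
    for _ in range(iters):
        # C x is half the Euclidean gradient of x* C x; the factor is
        # absorbed in the step size.
        x = x + step * project_tangent(x, C @ x)
        x = x / np.abs(x)                 # retraction: renormalize each entry
    return x

def min_certificate_eigenvalue(C, x):
    """Smallest eigenvalue of S(x) = Re{ddiag(C x x*)} - C."""
    S = np.diag(np.real((C @ x) * np.conj(x))) - C
    return np.linalg.eigvalsh((S + S.conj().T) / 2.0)[0]

# One synthetic instance under the assumed noise model.
rng = np.random.default_rng(0)
n, sigma = 100, 1.0
z = np.exp(1j * rng.uniform(0.0, 2.0 * np.pi, size=n))
G = (rng.standard_normal((n, n)) + 1j * rng.standard_normal((n, n))) / np.sqrt(2.0)
W = np.triu(G, k=1)
W = W + W.conj().T
C = np.outer(z, z.conj()) + sigma * W

x0 = np.linalg.eigh(C)[1][:, -1]          # leading eigenvector as starting point
x = riemannian_gradient_ascent(C, x0)
print("smallest eigenvalue of S(x):", min_certificate_eigenvalue(C, x))
print("correlation |z* x| / n:", abs(np.vdot(z, x)) / n)
```

A smallest eigenvalue that is nonnegative up to round-off certifies global optimality of the computed point; repeating this over a grid of $(\sigma, n)$ values would give a diagram of the kind described above.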

SLIDE 13

How is that possible?

The problem is convex in a lifted space

SLIDE 14

Classic lifting trick: rewrite everything in terms of $X = xx^*$

$\max_{x \in \mathbb{C}_1^n} x^* C x$

The cost: $x^* C x = \mathrm{Trace}(x^* C x) = \mathrm{Trace}(CX)$

The constraints: $|x_i|^2 = 1 \Leftrightarrow x_i x_i^* = 1 \Leftrightarrow X_{ii} = 1$

The knot: $\exists x : X = xx^* \Leftrightarrow X \succeq 0, \ \mathrm{rank}(X) = 1$
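The three lifting identities can be confirmed on a small random example. This is only an illustrative check, with a generic Hermitian matrix standing in for the cost matrix $C$:

```python
import numpy as np

rng = np.random.default_rng(0)
n = 6
x = np.exp(1j * rng.uniform(0.0, 2.0 * np.pi, size=n))   # unit-modulus vector
H = rng.standard_normal((n, n)) + 1j * rng.standard_normal((n, n))
C = (H + H.conj().T) / 2.0                                # any Hermitian cost matrix
X = np.outer(x, x.conj())                                 # lifted variable X = x x*

cost_x = np.real(np.vdot(x, C @ x))       # x* C x
cost_X = np.real(np.trace(C @ X))         # Trace(C X)
diag_X = np.diag(X)                       # should be all ones
eigs_X = np.linalg.eigvalsh(X)            # should be nonnegative
rank_X = np.linalg.matrix_rank(X)         # should be one

print(cost_x, cost_X, rank_X)
```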

SLIDE 15

Suggests a semidefinite relaxation

Recast the QCQP

$\max_{x \in \mathbb{C}_1^n} x^* C x$

Into

$\max_{X \in \mathbb{C}^{n \times n}} \mathrm{Trace}(CX)$ subject to $\mathrm{diag}(X) = \mathbf{1}$, $X \succeq 0$, $\mathrm{rank}(X) = 1$

SLIDE 16

Suggests a semidefinite relaxation

Relax the QCQP

$\max_{x \in \mathbb{C}_1^n} x^* C x$

Into the SDP

$\max_{X \in \mathbb{C}^{n \times n}} \mathrm{Trace}(CX)$ subject to $\mathrm{diag}(X) = \mathbf{1}$, $X \succeq 0$ (the constraint $\mathrm{rank}(X) = 1$ is dropped)

SLIDE 17

The SDP seems tight roughly when Riemannian optimization succeeds

[Figure: phase diagram with axes "Noise level $\sigma$" and "Number of phases $n$". Legend: "Tight: SDP solution has rank 1" and "Not tight".]

SLIDE 18

We prove the SDP has a unique solution of rank 1

This holds with high probability for large $n$, provided $\sigma \le c \cdot n^{1/6}$ (empirically, $O(n^{1/2})$ is ok).

SLIDE 19

General idea: dual certification

Lemma:

$X$ solves the SDP if and only if $S(X) = \Re\{\mathrm{ddiag}(CX)\} - C \succeq 0$.

Proof via KKT conditions.

Thus, the certificate works iff the SDP is tight.

$\max_{X \in \mathbb{C}^{n \times n}} \mathrm{Trace}(CX)$ subject to $\mathrm{diag}(X) = \mathbf{1}$, $X \succeq 0$
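The "if" direction of the lemma can be checked directly, without the full KKT machinery. This is a sketch added for clarity; the slides only cite the KKT conditions. Let $\hat{X}$ be feasible with $S(\hat{X}) \succeq 0$, and let $X$ be any other feasible point. Then:

```latex
0 \le \mathrm{Trace}\big( S(\hat{X})\, X \big)
  = \sum_i \Re\{ (C\hat{X})_{ii} \}\, X_{ii} - \mathrm{Trace}(CX)
  = \mathrm{Trace}(C\hat{X}) - \mathrm{Trace}(CX) .
```

The inequality uses $S(\hat{X}) \succeq 0$ and $X \succeq 0$; the last equality uses $X_{ii} = 1$ and the fact that $\mathrm{Trace}(C\hat{X})$ is real when $C$ and $\hat{X}$ are Hermitian. Hence no feasible $X$ has a larger cost than $\hat{X}$.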

SLIDE 20

General idea: dual certification

Let $x$ be the MLE, solution of

$\max_{x \in \mathbb{C}_1^n} x^* C x$

We aim to prove that $S(xx^*) \succeq 0$.

Challenge: we don't know $x$.

SLIDE 21

Step 1: characterize the MLE $x$

[Figure: level sets of $x^* C x$, showing $x$ (the MLE) and $z$ (the signal).]

$x$ is second-order critical, close to $z$.

SLIDE 22

Step 1: characterize the MLE $x$

The MLE problem lives on a smooth manifold:

$\max_{x \in \mathbb{C}_1^n} x^* C x$

$x$ is critical simply if its projected gradient vanishes:

$x$ is critical $\Leftrightarrow$ $Sx = 0$, where $S = S(x) = \Re\{\mathrm{ddiag}(C x x^*)\} - C$
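The link between the projected gradient and $S$ is an exact identity: for any unit-modulus $x$ and Hermitian $C$, $Sx$ equals minus the projection of $Cx$ onto the tangent space at $x$. The sketch below checks this numerically on a random example; the tangent-space projection formula used is the standard one for a product of unit circles and is not written out on the slide.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 8
x = np.exp(1j * rng.uniform(0.0, 2.0 * np.pi, size=n))   # any unit-modulus point
H = rng.standard_normal((n, n)) + 1j * rng.standard_normal((n, n))
C = (H + H.conj().T) / 2.0                                # any Hermitian matrix

Cx = C @ x
d = np.real(Cx * np.conj(x))          # real part of diag(C x x*)
proj_grad = Cx - d * x                # projection of C x onto the tangent space at x
S = np.diag(d) - C                    # S(x) = Re{ddiag(C x x*)} - C

# S x + proj_grad vanishes identically, so S x = 0 iff the projected gradient is 0.
residual = np.linalg.norm(S @ x + proj_grad)
print(residual)
```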

SLIDE 23

Step 1: characterize the MLE $x$

$S = S(x) = \Re\{\mathrm{ddiag}(C x x^*)\} - C$

$x$ is critical $\Leftrightarrow$ $Sx = 0$ $\Leftrightarrow$ $\mathrm{diag}(C x x^*)$ is real

If $x$ is also second-order critical, then $S$ is positive semidefinite on the tangent space.

$x$ is second-order critical $\Rightarrow$ $\mathrm{diag}(C x x^*) \ge \mathbf{1}$
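The last implication can be seen by testing $S$ on one particular tangent vector. This sketch is added for clarity and assumes that the diagonal of $W$ is zero, so that $C_{kk} = 1$, which the slides do not state. The vector $u = i\, x_k e_k$ satisfies $\Re\{\bar{x}_k u_k\} = 0$, so it is tangent at $x$, and:

```latex
0 \le u^* S u = |x_k|^2 S_{kk}
  = \Re\{ (C x x^*)_{kk} \} - C_{kk}
  = (C x x^*)_{kk} - 1 ,
```

where the last step uses that $\mathrm{diag}(C x x^*)$ is real at a critical point. Applying this for each $k$ gives $\mathrm{diag}(C x x^*) \ge \mathbf{1}$.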

SLIDE 24

Step 2: certify

Assuming $x$ is second-order critical and close to $z$, show that $S = \mathrm{ddiag}(C x x^*) - C \succeq 0$.

For all $u \in \mathbb{C}^n$ with $u^* x = 0$, seven lines give:

$u^* S u \ge \|u\|_2^2 \left( n - \sigma \left( 21\sqrt{n} + \|Wx\|_\infty \right) \right)$

(Used $\|W\|_{\mathrm{op}} \le 3\sqrt{n}$ again.)

SLIDE 25

Step 2: certify

Sufficient condition for tightness of the SDP:

$n \ge \sigma \left( 21\sqrt{n} + \|Wx\|_\infty \right)$

Note: $x$ and $W$ are not independent.

Suboptimal argument (whp):

$\|Wx\|_\infty \le \|Wz\|_\infty + \|W(x - z)\|_\infty$

$\le \sqrt{n \log n} + \|W\|_{\mathrm{op}} \|x - z\|_2$

$\le \sqrt{n \log n} + 3\sqrt{12 \sigma n^{3/2}}$
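Substituting this bound into the sufficient condition shows where the $n^{1/6}$ rate comes from. The following is a rough scaling argument added for clarity, ignoring constants:

```latex
\sigma \left( 21\sqrt{n} + \sqrt{n \log n} + 3\sqrt{12\,\sigma\, n^{3/2}} \right) \le n ,
\qquad\text{dominant term: }\;
3\sqrt{12}\;\sigma^{3/2} n^{3/4} \lesssim n
\;\Longleftrightarrow\;
\sigma \lesssim n^{1/6} .
```

When $\sigma \lesssim n^{1/6}$, the remaining terms $\sigma\sqrt{n}$ and $\sigma\sqrt{n \log n}$ are of lower order than $n$, so the last term is the one that limits the noise level.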

SLIDE 26

Theorem (arXiv 1411.3272)

With high probability for large (finite) $n$, if $\sigma \le c \cdot n^{1/6}$, then a second-order critical point $x$ close to $z$ is optimal, with certificate $S$, and $xx^*$ is the unique solution of the SDP.