Beyond Compressed Sensing: The Effectiveness of Convex Programming - PowerPoint PPT Presentation

Robust PCA Data increasingly high dimensional Gross errors frequently occur in many applications Image processing Occlusions Web data analysis Malicious tampering Bioinformatics Sensor failures ... ...   ❆ x 12 . . . x 1 n ❆ x 21 x 22 . . .     . . . .  . . . .  . . . .   ❆ x d 1 . . . x dn Important to make PCA robust

Gross errors Movies   × × ❆ ❆     ❆ ×   Users   × ×     ×   ❆ × Observe corrupted entries Y ij = L ij + S ij ( i, j ) ∈ Ω obs L low-rank matrix S entries that have been tampered with (impulsive noise) Problem Recover L from missing and corrupted samples

When does separation make sense? M = L + S   ∗ 0 0 0 · · · 0 0 ∗ 0 0 0 · · · 0 0     Sparse component cannot be low rank: S = . . . . . .  . . . . . .  . . . . . .   ∗ 0 0 0 · · · 0 0 Sparsity pattern will be assumed (uniform) random

When does separation make sense? M = L + S   ∗ 0 0 0 · · · 0 0 ∗ 0 0 0 · · · 0 0     Sparse component cannot be low rank: S = . . . . . .  . . . . . .  . . . . . .   ∗ 0 0 0 · · · 0 0 Sparsity pattern will be assumed (uniform) random   ∗ ∗ ∗ ∗ · · · ∗ ∗ 0 0 0 0 · · · 0 0     Low-rank component cannot be sparse: L = . . . . . .  . . . . . .  . . . . . .   0 0 0 0 · · · 0 0

Sparse component cannot be low-rank     ❆ x 1 x 2 · · · x n − 1 x n x 2 · · · x n − 1 x n x 1 x 2 · · · x n − 1 x n ❆ · · · x 2 x n − 1 x n         ⇒ L + S = L = . . . . . . . . . .     . . . . . . . . . . . . . . . . . . . .     · · · x 1 x 2 x n − 1 x n ❆ x 2 · · · x n − 1 x n � �� 1 x ∗

Low-rank component cannot be sparse   x 1 x 2 x 3 x 4 · · · x n − 1 x n x 1 x 2 x 3 x 4 · · · x n − 1 x n     0 0 0 0 · · · 0 0     L = 0 0 0 0 · · · 0 0     . . . . . .  . . . . . .  . . . . . .   0 0 0 0 · · · 0 0 Incoherent condition [C. and Recht (’08)]: column and row spaces not aligned with coordinate axes (cannot have small subsets of rows and/or columns that are singular)

Low-rank component cannot be sparse   ❆ x 1 x 2 x 4 · · · x n − 1 x n  ❆ ❆  x 1 x 2 x 4 · · · x n − 1     ❆ ❆ 0 0 · · · 0 0   M =   ❆ 0 0 0 · · · 0 0     . . . . . . . . . . . .   . . . . . .   ❆ ❆ 0 0 0 · · · 0 Incoherent condition [C. and Recht (’08)]: column and row spaces not aligned with coordinate axes (cannot have small subsets of rows and/or columns that are singular)

Demixing by convex programming M = L + S L unknown (rank unknown) S unknown (# of entries � = 0 , locations, magnitudes all unknown)

Demixing by convex programming M = L + S L unknown (rank unknown) S unknown (# of entries � = 0 , locations, magnitudes all unknown) Recovery via SDP � ˆ L � ∗ + λ � ˆ S � 1 minimize L + ˆ ˆ subject to S = M See also Chandrasekaran, Sanghavi, Parrilo, Willsky (’09) nuclear norm: � L � ∗ = � i σ i ( L ) (sum of sing. values) ℓ 1 norm: � S � 1 = � ij | S ij | (sum of abs. values)

❆ ❆ ❆ ❆ ❆ ❆ ❆ ❆ ❆ ❆ ❆ ❆ ❆ ❆ ❆ ❆ ❆ ❆ ❆ ❆ ❆ ❆ ❆ ❆ ❆ Exact recovery via SDP min � ˆ L � ∗ + λ � ˆ L + ˆ ˆ S � 1 s. t. S = M

❆ ❆ ❆ ❆ ❆ ❆ ❆ ❆ ❆ ❆ ❆ ❆ ❆ ❆ ❆ ❆ ❆ ❆ ❆ ❆ ❆ ❆ ❆ ❆ ❆ Exact recovery via SDP min � ˆ L � ∗ + λ � ˆ L + ˆ ˆ S � 1 s. t. S = M Theorem L is n × n of rank ( L ) ≤ ρ r n (log n ) − 2 and incoherent S is n × n , random sparsity pattern of cardinality at most ρ s n 2 Then with probability 1 − O ( n − 10 ) , SDP with λ = 1 / √ n is exact: ˆ ˆ L = L, S = S √ Same conclusion for rectangular matrices with λ = 1 / max dim

Exact recovery via SDP min � ˆ L � ∗ + λ � ˆ L + ˆ ˆ S � 1 s. t. S = M Theorem L is n × n of rank ( L ) ≤ ρ r n (log n ) − 2 and incoherent S is n × n , random sparsity pattern of cardinality at most ρ s n 2 Then with probability 1 − O ( n − 10 ) , SDP with λ = 1 / √ n is exact: ˆ ˆ L = L, S = S √ Same conclusion for rectangular matrices with λ = 1 / max dim   ❆ ❆ ❆ ❆ × × ❆ ❆ ❆ ❆  × ×    No tuning parameter!   × ❆ ❆ × ❆ ❆     Whatever the magnitudes of L and S ❆ ❆ × ❆ ❆ ×     × ❆ ❆ ❆ ❆ ❆   ❆ ❆ ❆ ❆ × ×

Phase transitions in probability of success (a) RPCA, Random Signs (b) RPCA, Coherent Signs (c) Matrix Completion L = XY T is a product of independent n × r i.i.d. N (0 , 1 /n ) matrices

Missing and corrupted ❆  ? ? ?  × × RPCA ❆ ? ? ? ? ×     ? ? ? ? × ×     ❆ ? ? ? ? � ˆ L � ∗ + λ � ˆ × min S � 1     ❆ ? ? ? ? × L ij + ˆ ˆ   s. t. S ij = L ij + S ij ( i, j ) ∈ Ω obs ❆ ? ? ? ? × Same theorem: with high prob. 1 ˆ λ = √ = ⇒ L = L ! frac. observed × max dim

Video surveillance Sequence of 200 video frames ( 144 × 172 pixels) with a static background Problem: detect any activity in the foreground … RPCA …

L + S background subtraction

L + S background subtraction From GoDec

L + S reconstruction of MR angiography L + S L S automatic and improved background suppression Joint with R. Otazo and D. Sodickson

Free-breathing MRI of the liver NUFFT Standard L + S Motion-Guided L + S 12.8 fold acceleration min � L � ∗ + λ � S � 1 s. t. A ( L + S ) = y Joint with R. Otazo and D. Sodickson

Free-breathing MRI of the liver NUFFT Standard L + S Motion-Guided L + S Temporal blurring Joint with R. Otazo and D. Sodickson

Free-breathing MRI of the kidneys NUFFT Standard L + S Motion-Guided L + S 12.8 fold acceleration min � L � ∗ + λ � S � 1 s. t. A ( L + S ) = y Joint with R. Otazo and D. Sodickson

Free-breathing MRI of the kidneys NUFFT Standard L + S Motion-Guided L + S Joint with R. Otazo and D. Sodickson

Story #3: Super-resolution Collaborator: C. Fernandez-Granda

Limits of resolution In any optical imaging system, diffraction imposes fundamental limit on resolution The physical phenomenon called diffraction is of the utmost importance in the theory of optical imaging systems (Joseph Goodman) Interested in usual bandlimited imaging systems

Pupil Airy disk Cross section

Rayleigh resolution limit Lord Rayleigh

The super-resolution problem ⇐ Fundamental problem objective data Radar Retrieve fine scale information from low-pass data Microscopy Spectroscopy Medical imaging Astronomy Geophysics ⇐ ... Equivalent description: extrapolate spectrum

Single molecule imaging Microscope receives light from fluorescent molecules Problem Resolution is much coarser than size of individual molecules (low-pass data) Can we ‘beat’ the diffraction limit and super-resolve those molecules? Higher molecule density − → faster imaging

Mathematical model Signal x = � j a j δ τ j a j ∈ C , τ j ∈ [0 , 1] Data y = F n x : n = 2 f lo + 1 low-frequency coefficients (Nyquist sampling) � 1 � e − i 2 πkt x ( d t ) = a j e − i 2 πkτ j k ∈ Z , | k | ≤ f lo y ( k ) = 0 j Resolution limit: ( λ lo / 2 is Rayleigh distance) 1 /f lo = λ lo

Mathematical model Signal x = � j a j δ τ j a j ∈ C , τ j ∈ [0 , 1] Data y = F n x : n = 2 f lo + 1 low-frequency coefficients (Nyquist sampling) � 1 � e − i 2 πkt x ( d t ) = a j e − i 2 πkτ j k ∈ Z , | k | ≤ f lo y ( k ) = 0 j Resolution limit: ( λ lo / 2 is Rayleigh distance) 1 /f lo = λ lo Question Can we resolve the signal beyond this limit?

Can you find the spikes? Low-frequency data about spike train

Recovery by minimum total variation Recovery by cvx prog. min � ˜ x � TV subject to F n ˜ x = y � | x ( d t ) | is continuous analog of discrete ℓ 1 norm � x � ℓ 1 = � � x � TV = t | x t | � � x = a j δ τ j = ⇒ � x � TV = | a j | j j Work on ℓ 1 minimization: Logan, Donoho, Stark, Tropp, Elad, C. , Tao...

Recovery by convex programming � 1 e − i 2 πkt x ( d t ) | k | ≤ f lo y ( k ) = 0

Recovery by convex programming � 1 e − i 2 πkt x ( d t ) | k | ≤ f lo y ( k ) = 0 Theorem (C. and Fernandez Granda (2012)) If spikes are separated by at least 1 . 86 /f lo := 1 . 86 λ lo then min TV solution is exact! (Current state of the art 1 . 4 λ lo )

Recovery by convex programming � 1 e − i 2 πkt x ( d t ) | k | ≤ f lo y ( k ) = 0 Theorem (C. and Fernandez Granda (2012)) If spikes are separated by at least 1 . 86 /f lo := 1 . 86 λ lo then min TV solution is exact! (Current state of the art 1 . 4 λ lo ) Infinite precision! (Whatever the amplitudes)

Recovery by convex programming � 1 e − i 2 πkt x ( d t ) | k | ≤ f lo y ( k ) = 0 Theorem (C. and Fernandez Granda (2012)) If spikes are separated by at least 1 . 86 /f lo := 1 . 86 λ lo then min TV solution is exact! (Current state of the art 1 . 4 λ lo ) Infinite precision! (Whatever the amplitudes) Cannot go below λ lo

Recovery by convex programming � 1 e − i 2 πkt x ( d t ) | k | ≤ f lo y ( k ) = 0 Theorem (C. and Fernandez Granda (2012)) If spikes are separated by at least 1 . 86 /f lo := 1 . 86 λ lo then min TV solution is exact! (Current state of the art 1 . 4 λ lo ) Infinite precision! (Whatever the amplitudes) Cannot go below λ lo Can recover (2 λ lo ) − 1 = f lo / 2 = n/ 4 spikes from n low-freq. samples

Recovery by convex programming � 1 e − i 2 πkt x ( d t ) | k | ≤ f lo y ( k ) = 0 Theorem (C. and Fernandez Granda (2012)) If spikes are separated by at least 1 . 86 /f lo := 1 . 86 λ lo then min TV solution is exact! (Current state of the art 1 . 4 λ lo ) Infinite precision! (Whatever the amplitudes) Cannot go below λ lo Can recover (2 λ lo ) − 1 = f lo / 2 = n/ 4 spikes from n low-freq. samples Essentially same result in higher dimensions

Formulation as a finite-dimensional problem Dual problem Primal problem max Re � y, c � s. t. �F ∗ n c � ∞ ≤ 1 min � x � TV s. t. F n x = y Finite-dimensional variable c Infinite-dimensional variable x Infinitely many constraints Finitely many constraints � ( F ∗ c k e i 2 πkt n c )( t ) = | k |≤ f lo

Formulation as a finite-dimensional problem Dual problem Primal problem max Re � y, c � s. t. �F ∗ n c � ∞ ≤ 1 min � x � TV s. t. F n x = y Finite-dimensional variable c Infinite-dimensional variable x Infinitely many constraints Finitely many constraints � ( F ∗ c k e i 2 πkt n c )( t ) = | k |≤ f lo Semidefinite representability | ( F ∗ n c )( t ) | ≤ 1 for all t ∈ [0 , 1] equivalent to (1) there is Q Hermitian s. t. � � Q c � 0 c ∗ 1 (2) Tr ( Q ) = 1 (3) sums along superdiagonals vanish: � n − j i =1 Q i,i + j = 0 for 1 ≤ j ≤ n − 1

Dual solution c : coeffs. of low-pass trig. poly. � k c k e i 2 πkt interpolating sign of primal solution

Super-resolution via semidefinite programming

Super-resolution via semidefinite programming 1. Solve semidefinite program to obtain dual solution

Super-resolution via semidefinite programming 2. Locate points at which corresponding polynomial has unit magnitude

Super-resolution via semidefinite programming Signal Estimate 3. Estimate amplitudes via least squares

Support-location accuracy f c 25 50 75 100 6 . 66 10 − 9 1 . 70 10 − 9 5 . 58 10 − 10 2 . 96 10 − 10 Average error 1 . 83 10 − 7 8 . 14 10 − 8 2 . 55 10 − 8 2 . 31 10 − 8 Maximum error For each f c , 100 random signals with | T | = f c / 4 and ∆( T ) ≥ 2 / f c

Example Minimum separation : 1 . 5 λ c

Example SNR 20 dB Noisy Noiseless

Beyond Compressed Sensing: The Effectiveness of Convex Programming - PowerPoint PPT Presentation

Beyond Compressed Sensing: The Effectiveness of Convex Programming in the Information and Physical Sciences Emmanuel Cand` es EUSIPCO 2015, Nice, September 2015 Three stories about signal recovery from missing information Today I want to

Decoding in Compressed Sensing Ronald DeVore USC, 2008 p. 1/33 Discrete Compressed Sensing R

Convex Hell 362 dnc CS 16: Convex Hull Whoops, I mean... Convex Hull Whats a Convex Hull?

Compressed Membership for NFA (DFA) with Compressed Labels is in NP (P) Artur Je University of

Foundations of Compressed Sensing Mike Davies Edinburgh Compressed Sensing research group (E-CoS)

Compressed Sensing: Challenges and Emerging Topics Mike Davies Edinburgh Compressed Sensing

Deep Compressed Sensing Yan Wu, Mihaela Rosca, Tim Lillicrap Compressed Sensing A Brief Review

Infinite Dimensional Compressed Sensing Anders C. Hansen, University of Cambridge Chemnitz,

Convex hull 1 - 1 Convex hull 1 - 2 Convex hull 1 - 3 Convex hull Definition, extremal

CS133 Computational Geometry Convex Hull 1 Convex Hull Given a set of n points, find the

constrained convex optimization virgil pavlu 1 convex set a set X in a vector space is convex if

Introduction to Compressed Sensing Gitta Kutyniok (Institut f ur Mathematik, Technische

Compressed Sensing and Bayesian Experimental Design or Optimal Sensing and Reconstruction of N -

CS675: Convex and Combinatorial Optimization Spring 2018 Convex Sets Instructor: Shaddin Dughmi

Convex hull: basic facts Convex hull: basic facts CG Lecture 1 CG Lecture 1 Problem : give a set

Convex hulls of spheres and convex hulls of convex polytopes lying on parallel hyperplanes

CS675: Convex and Combinatorial Optimization Fall 2019 Convex Functions Instructor: Shaddin

GSAS-II: WHAT DOES IT DO? WHAT IS NEW? WHAT IS COMING AND WHAT IS NOT.

Recovery of Compactly Supported Functions from Spectrogram Measurements via Lifting Mark Iwen

Modern Optimization Meets Physics: Recent Progress on Phase Retrieval Yuxin Chen (on behalf of

Preparation and characterisation Julia Lyubina Institute for Metallic Materials, IFW Dresden,

http://t2k-experiment.org/ George Paget Thomson Clinton Davisson and Lester Germer 1892-1975

Diamond Light Source a New Light for Science Richard P. Walker, Technical Director 1.

The Coherent X-ray Imaging (CXI) Instrument at LCLS Sbastien Boutet SLAC National Accelerator

SELECTION Deterministic Stochastic Proportionate selection: Roulette Wheel Selection