Mathematics of Sparsity (and a Few Other Things)

Emmanuel Candès
International Congress of Mathematicians (ICM 2014), Seoul, August 2014

Some motivation: Magnetic Resonance Imaging (MRI). [Figure: MR scanner and MR image. Image from K. Pauly, G. Gold]


  1. Early use of the $\ell_1$ norm. A rich history in applied science: Logan ('50s), Claerbout ('70s), Santosa and Symes ('80s), Donoho ('90s), Osher and Rudin ('90s), Tibshirani ('90s), and many since then. [Photo: Ben Logan (1927–), mathematician and bluegrass fiddler]

  2. A Taste of Analysis: Geometry and Probability

  3. Geometry. The descent cone of the norm at $x$ is $\mathcal{C} = \{ h : \|x + th\| \le \|x\| \text{ for some } t > 0 \}$. Exact recovery if $\mathcal{C} \cap \mathrm{null}(A) = \{0\}$. [Figure: descent cone at $x$ meeting $\mathrm{null}(A)$ only at the origin]
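
The recovery condition on this slide can be tested numerically. Below is a minimal sketch (not from the talk; all sizes are illustrative): $\ell_1$ minimization, $\min \|x\|_{\ell_1}$ s.t. $Ax = y$, recast as a linear program over $(x, u)$ with $-u \le x \le u$ and solved with SciPy. With a Gaussian $A$ and enough equations, the sparse vector is recovered exactly.

```python
import numpy as np
from scipy.optimize import linprog

rng = np.random.default_rng(0)
n, m, s = 100, 50, 5                       # illustrative sizes, not from the talk

x_true = np.zeros(n)
x_true[rng.choice(n, s, replace=False)] = rng.standard_normal(s)
A = rng.standard_normal((m, n))            # Gaussian measurement map
y = A @ x_true

# min sum(u)  s.t.  x - u <= 0,  -x - u <= 0,  Ax = y   (so u = |x| at the optimum)
c = np.concatenate([np.zeros(n), np.ones(n)])
A_ub = np.block([[np.eye(n), -np.eye(n)],
                 [-np.eye(n), -np.eye(n)]])
A_eq = np.hstack([A, np.zeros((m, n))])
res = linprog(c, A_ub=A_ub, b_ub=np.zeros(2 * n), A_eq=A_eq, b_eq=y,
              bounds=[(None, None)] * n + [(0, None)] * n)

x_hat = res.x[:n]
print("recovery error:", np.linalg.norm(x_hat - x_true))   # tiny: exact recovery
```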

  4. Gaussian models. Entries of $A$ i.i.d. $N(0,1)$, so the row vectors $a_1, \ldots, a_m$ are i.i.d. $N(0, I)$. Important consequence: $\mathrm{null}(A)$ is uniformly distributed, so $P(\mathcal{C} \cap \mathrm{null}(A) = \{0\})$ is a volume calculation.

  5. Volume calculations: geometric functional analysis.

  6. Volume of a cone. Polar cone: $\mathcal{C}^\circ = \{ y : \langle y, z \rangle \le 0 \text{ for all } z \in \mathcal{C} \}$. Statistical dimension: $\delta(\mathcal{C}) := E_g \min_{z \in \mathcal{C}^\circ} \|g - z\|_{\ell_2}^2 = E_g \|\pi_{\mathcal{C}}(g)\|_{\ell_2}^2$, where $g \sim N(0, I)$ and $\pi_{\mathcal{C}}$ is the projection onto $\mathcal{C}$. [Figure: descent cone $\mathcal{C}$, polar cone $\mathcal{C}^\circ$, and a Gaussian vector $g$]
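
The statistical dimension is easy to estimate by Monte Carlo for cones whose projection is explicit. A minimal sketch (not from the talk) for the nonnegative orthant $C = \mathbb{R}^n_+$, where $\pi_C(g) = \max(g, 0)$ coordinatewise and the exact value is $n/2$:

```python
import numpy as np

rng = np.random.default_rng(1)
n, trials = 50, 20000
g = rng.standard_normal((trials, n))
# delta(C) = E ||pi_C(g)||^2 with pi_C(g) = max(g, 0) for the orthant
delta_hat = (np.maximum(g, 0.0) ** 2).sum(axis=1).mean()
print(delta_hat, "vs exact value", n / 2)   # ~25
```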

  7. Gordon's escape lemma. Theorem (Gordon '88): let $K \subset \mathbb{R}^n$ be a convex cone and $A$ an $m \times n$ Gaussian matrix. With probability at least $1 - e^{-t^2/2}$,
$$m \ge \big(\sqrt{\delta(K)} + t\big)^2 + 1 \implies \mathrm{null}(A) \cap K = \{0\},$$
where $m = \mathrm{codim}(\mathrm{null}(A))$. Implication: exact recovery if $m \gtrsim \delta(\mathcal{C})$ (roughly) [Rudelson & Vershynin ('08)]. Gordon's lemma was originally stated with the Gaussian width $w(K) := E_g \sup_{z \in K \cap S^{n-1}} \langle g, z \rangle$; the two quantities are comparable: $\delta(K) - 1 \le w^2(K) \le \delta(K)$.
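
The comparison $\delta(K) - 1 \le w^2(K) \le \delta(K)$ can also be checked numerically. A minimal sketch (not from the talk), again for the orthant $K = \mathbb{R}^n_+$, where $\sup_{z \in K \cap S^{n-1}} \langle g, z \rangle = \|\max(g, 0)\|_{\ell_2}$ whenever $g$ has a positive coordinate:

```python
import numpy as np

rng = np.random.default_rng(2)
n, trials = 50, 20000
g = rng.standard_normal((trials, n))
pos = np.maximum(g, 0.0)
w_hat = np.linalg.norm(pos, axis=1).mean()        # Gaussian width w(K)
delta_hat = (pos ** 2).sum(axis=1).mean()         # statistical dimension delta(K)
print(delta_hat - 1, "<=", w_hat ** 2, "<=", delta_hat)
```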

  8. Statistical dimension of the $\ell_1$ descent cone. The polar cone $\mathcal{C}^\circ$ is the cone of the subdifferential: $\mathcal{C}^\circ = \{ tu : t > 0 \text{ and } u \in \partial\|x\| \}$, where $u \in \partial\|x\|$ iff $\|x + h\| \ge \|x\| + \langle u, h \rangle$ for all $h$. For $x^\star = (\ast, \ast, \ldots, \ast, 0, 0, \ldots, 0)$ with $s$ nonzero entries followed by $n - s$ zeros,
$$u \in \partial\|x^\star\|_{\ell_1} \iff \begin{cases} u_i = \mathrm{sgn}(x^\star_i), & 1 \le i \le s, \\ |u_i| \le 1, & i > s. \end{cases}$$
Therefore
$$\delta(\mathcal{C}) = E_g \min_{z \in \mathcal{C}^\circ} \|g - z\|_{\ell_2}^2 = E \inf_{t \ge 0,\ u \in \partial\|x^\star\|_{\ell_1}} \Big\{ \sum_{i \le s} (g_i - t u_i)^2 + \sum_{i > s} (g_i - t u_i)^2 \Big\} = E \inf_{t \ge 0} \Big\{ \sum_{i \le s} (g_i \pm t)^2 + \sum_{i > s} (|g_i| - t)_+^2 \Big\},$$
and exchanging the infimum and the expectation gives the upper bound
$$\delta(\mathcal{C}) \le \inf_{t \ge 0} \Big\{ s \cdot (1 + t^2) + (n - s) \cdot E(|g_1| - t)_+^2 \Big\} \le 2s \log(n/s) + 2s,$$
a sufficient number of equations. Stojnic ('09); Chandrasekaran, Recht, Parrilo, Willsky ('12).
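
The final bound is simple but loose; the infimum over $t$ can be evaluated exactly. A minimal sketch (not from the talk) that computes $\inf_{t \ge 0} \{ s(1+t^2) + (n-s) E(|g_1| - t)_+^2 \}$ via the closed form $E(|g_1| - t)_+^2 = 2[(1+t^2)Q(t) - t\varphi(t)]$, with $\varphi$ and $Q$ the standard normal density and tail, and compares it with $2s\log(n/s) + 2s$:

```python
import numpy as np
from scipy.stats import norm
from scipy.optimize import minimize_scalar

def delta_upper_bound(n, s):
    # E(|g| - t)_+^2 = 2[(1 + t^2) Q(t) - t phi(t)], with Q the Gaussian tail
    def objective(t):
        tail_term = 2 * ((1 + t**2) * norm.sf(t) - t * norm.pdf(t))
        return s * (1 + t**2) + (n - s) * tail_term
    return minimize_scalar(objective, bounds=(0, 20), method="bounded").fun

n, s = 1000, 20
print("inf over t:      ", delta_upper_bound(n, s))        # sharper value
print("2s log(n/s) + 2s:", 2 * s * np.log(n / s) + 2 * s)  # ~196.5
```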

  9. Phase transitions for Gaussian maps. Theorem (Amelunxen, Lotz, McCoy and Tropp '13): let $\mathcal{C}$ be the descent cone of the norm $\|\cdot\|$ at a fixed $x^\star \in \mathbb{R}^n$. Then for a fixed $\varepsilon \in (0, 1)$,
$$m \le \delta(\mathcal{C}) - a_\varepsilon \sqrt{n} \implies \text{cvx. prog. succeeds with prob.} \le \varepsilon,$$
$$m \ge \delta(\mathcal{C}) + a_\varepsilon \sqrt{n} \implies \text{cvx. prog. succeeds with prob.} \ge 1 - \varepsilon,$$
where $a_\varepsilon = \sqrt{8 \log(4/\varepsilon)}$. [Figure: empirical phase transitions, courtesy of Amelunxen, Lotz, McCoy and Tropp.] Asymptotic phase transition for $\ell_1$ recovery: Donoho ('06), Donoho & Tanner ('09).
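
A minimal sketch (not from the talk) reproducing the flavor of the figure: sweep the number of equations $m$ and record the empirical success rate of $\ell_1$ recovery, reusing the linear-programming formulation from the sketch after slide 3. The success rate jumps from near 0 to near 1 as $m$ crosses $\delta(\mathcal{C})$ (upper-bounded by $2s\log(n/s) + 2s \approx 30$ for these sizes).

```python
import numpy as np
from scipy.optimize import linprog

rng = np.random.default_rng(3)
n, s, trials = 60, 4, 25                   # illustrative sizes, not from the talk

def l1_recover(A, y, n):
    c = np.concatenate([np.zeros(n), np.ones(n)])
    A_ub = np.block([[np.eye(n), -np.eye(n)], [-np.eye(n), -np.eye(n)]])
    A_eq = np.hstack([A, np.zeros((A.shape[0], n))])
    res = linprog(c, A_ub=A_ub, b_ub=np.zeros(2 * n), A_eq=A_eq, b_eq=y,
                  bounds=[(None, None)] * n + [(0, None)] * n)
    return res.x[:n]

for m in range(5, 46, 5):
    wins = 0
    for _ in range(trials):
        x = np.zeros(n)
        x[rng.choice(n, s, replace=False)] = rng.standard_normal(s)
        A = rng.standard_normal((m, n))
        wins += np.linalg.norm(l1_recover(A, A @ x, n) - x) < 1e-5
    print(f"m = {m:2d}: empirical success rate {wins / trials:.2f}")
```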

  10. Discrete geometry approach (Donoho and Tanner '06, '09). Cross-polytope $P = \{ x \in \mathbb{R}^n : \|x\|_{\ell_1} \le 1 \}$ and projected polytope $AP$. An $s$-sparse $x$ lies on an $(s-1)$-dimensional face $F$ of $P$, and $\ell_1$ succeeds $\iff$ the face $F$ is conserved ($AF$ is a face of the projected polytope). [Figure: cross-polytope with vertices $e_1, e_2, e_3$ projected by $A$ onto its range.] Integral geometry of convex sets: McMullen ('75), Grünbaum ('68). Polytope angle calculations: Vershik and Sporyshev ('86, '92), Affentranger and Schneider ('92).

  11. Non-Gaussian models: MRI, collaborative filtering. Under incoherence, the convex program succeeds if $m \gtrsim \mathrm{df} \cdot \log n$, where $m$ is the number of equations and $\mathrm{df}$ the number of degrees of freedom.

  12. Dual certificates. Consider $\min \|x\|$ s.t. $y = Ax$. Then $x$ is a solution iff there exists $v \perp \mathrm{null}(A)$ (equivalently, $v \in \mathrm{row}(A)$) with $v \in \mathcal{C}^\circ$, i.e. $v \in \partial\|x\|$. [Figure: $\mathrm{row}(A)$, $\mathrm{null}(A)$, descent cone and polar cone]

  13. Sparse recovery. Dual certificate: $v \in \mathrm{row}(A) = \mathrm{span}(a_1, \ldots, a_m)$ with $v_i = \mathrm{sgn}(x_i)$ if $x_i \ne 0$ and $|v_i| \le 1$ if $x_i = 0$. Example (Fourier sampling): $a_k(t) = e^{i 2\pi \omega_k t}$ with $\omega_k$ random, so a certificate takes the form $v(t) = \sum_k c_k e^{i 2\pi \omega_k t} \in \mathrm{row}(A)$. [Figure: dual certificate $v(t)$ interpolating $\mathrm{sgn}(x)$ on the support, with values between $-1$ and $+1$ elsewhere]

  14. Dual certificate construction. Let $P$ denote projection onto the support: $(Pv)_i = v_i$ if $x_i \ne 0$ and $(Pv)_i = 0$ if $x_i = 0$. We seek $v \in \mathrm{row}(A)$ with $Pv = \mathrm{sgn}(x)$ and $\|(I - P)v\|_{\ell_\infty} \le 1$. Candidate certificate: minimize $\|v\|_{\ell_2}$ subject to $Pv = \mathrm{sgn}(x)$ and $v \in \mathrm{row}(A)$, which gives $v = A^* A (P A^* A P)^{-1} \mathrm{sgn}(x)$ (the inverse taken on the support). Analysis via combinatorial methods: sparse signal recovery (C., Romberg and Tao '04); matrix completion (C. and Tao '09). Analysis for matrix completion via tools from geometric functional analysis (C. and Recht '08). Gives accurate answers in the Gaussian case: $m \ge 2s \log n$ (C. and Recht '12). Widely used since then.
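
The candidate certificate is straightforward to build and test numerically. A minimal sketch (not from the talk): with $P$ the projection onto the support $S$, the formula reduces to $v = A^* A_S (A_S^* A_S)^{-1} \mathrm{sgn}(x_S)$, where $A_S$ keeps the columns of $A$ indexed by $S$; one then checks the off-support condition.

```python
import numpy as np

rng = np.random.default_rng(4)
n, m, s = 200, 120, 8                      # here m comfortably exceeds 2 s log n
S = rng.choice(n, s, replace=False)        # support
sgn = rng.choice([-1.0, 1.0], size=s)      # sign pattern on the support

A = rng.standard_normal((m, n)) / np.sqrt(m)
A_S = A[:, S]

# v = A^T A_S (A_S^T A_S)^{-1} sgn: least-squares certificate in row(A)
v = A.T @ (A_S @ np.linalg.solve(A_S.T @ A_S, sgn))

print("Pv = sgn(x) on the support:", np.allclose(v[S], sgn))
print("max |v_i| off the support: ", np.abs(np.delete(v, S)).max())  # < 1 w.h.p.
```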

  15. Some Immediate and (Far) Less Immediate Applications

  16. Impact on MR pediatrics. Lustig (UCB), Pauly, Vasanawala (Stanford). 6 year old: 8X acceleration, 16 second scan, 0.875 mm in-plane resolution, 1.6 mm slice thickness, 32 channels.

  17. 1 year old female with liver lesions: 8X acceleration. Lustig (UCB), Pauly, Vasanawala (Stanford). [Images: parallel imaging (PI) vs. compressed sensing + PI.] Lesions are barely seen with the linear reconstruction.

  18. 6 year old male abdomen: 8X acceleration. Lustig (UCB), Pauly, Vasanawala (Stanford). [Images: parallel imaging (PI) vs. compressed sensing + PI.] Fine structures (arrows) are buried in noise (artifacts + noise amplification) and recovered by CS ($\ell_1$ + wavelets).

  19. Missing phase problem. Eyes and detectors see intensity, but light is a wave: it has both intensity and phase. Phase retrieval: find $x \in \mathbb{C}^n$ subject to $y = |Ax|^2$ (i.e., $y_k = |\langle a_k, x \rangle|^2$, $k = 1, \ldots, m$).

  20. Origin in X-ray crystallography: 10 Nobel Prizes in X-ray crystallography, and counting...

  21. Another look at phase retrieval (with Eldar, Strohmer and Voroninski). Find $x$ subject to $|\langle a_k, x \rangle|^2 = y_k$, $k = 1, \ldots, m$. Solving quadratic equations is NP-hard in general, hence ad hoc solutions. Lifting: set $X = xx^*$, so that $|\langle a_k, x \rangle|^2 = \mathrm{Tr}(a_k a_k^* xx^*) = \mathrm{Tr}(a_k a_k^* X) =: \mathcal{A}(X)_k$. Phase retrieval problem: find $X$ such that $\mathcal{A}(X) = y$, $X \succeq 0$, $\mathrm{rank}(X) = 1$. PhaseLift: minimize $\mathrm{Tr}(X)$ subject to $\mathcal{A}(X) = y$, $X \succeq 0$. Other convex relaxations of quadratically constrained QPs: Shor ('87); Goemans and Williamson ('95) [MAX-CUT].
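
A minimal sketch (not from the talk) of the PhaseLift relaxation for real-valued data, solved with cvxpy (an assumed dependency, using its default conic solver); the signal is read off, up to a global sign, from the leading eigenvector of the recovered $X$.

```python
import numpy as np
import cvxpy as cp

rng = np.random.default_rng(5)
n, m = 10, 60                              # illustrative sizes; theory asks m ~ n
x = rng.standard_normal(n)
a = rng.standard_normal((m, n))
y = (a @ x) ** 2                           # phaseless data |<a_k, x>|^2

# PhaseLift: min Tr(X)  s.t.  a_k^T X a_k = y_k,  X >= 0
X = cp.Variable((n, n), symmetric=True)
constraints = [X >> 0] + [cp.sum(cp.multiply(np.outer(a[k], a[k]), X)) == y[k]
                          for k in range(m)]
cp.Problem(cp.Minimize(cp.trace(X)), constraints).solve()

vals, vecs = np.linalg.eigh(X.value)       # X should be (nearly) rank one
x_hat = np.sqrt(max(vals[-1], 0.0)) * vecs[:, -1]
err = min(np.linalg.norm(x_hat - x), np.linalg.norm(x_hat + x))
print("recovery error up to sign:", err)   # small, up to solver tolerance
```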

  22. A surprise. Phase retrieval: find $x$ s.t. $y_k = |\langle a_k, x \rangle|^2$. PhaseLift: $\min \mathrm{Tr}(X)$ s.t. $\mathcal{A}(X) = y$, $X \succeq 0$. Theorem (C. and Li '12; C., Strohmer and Voroninski '11): if the $a_k$ are independently and uniformly sampled on the unit sphere and $m \gtrsim n$, then with probability $1 - O(e^{-\gamma m})$ the only feasible point is $xx^*$:
$$\{ X : \mathcal{A}(X) = y \text{ and } X \succeq 0 \} = \{ xx^* \}.$$
Proof via construction of dual certificates.

  23. A separation problem. Candès, Li, Wright, Ma ('09); Chandrasekaran, Sanghavi, Parrilo, Willsky ('09). Decompose $Y = L + S$, where $Y$ is the data matrix (observed), $L$ is low-rank (unobserved), and $S$ is sparse (unobserved). [Figure: fully observed data matrix and the sparse corruption pattern.] Can we recover $L$ and $S$ accurately? It looks impossible. Recovering low-dimensional structure from corrupted data in this way is an approach to robust principal component analysis (PCA).
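
A minimal sketch (not from the talk) of the convex separation program studied in the works cited on this slide, principal component pursuit: minimize $\|L\|_* + \lambda \|S\|_{\ell_1}$ subject to $L + S = Y$, with the standard weight $\lambda = 1/\sqrt{n}$ for an $n \times n$ matrix; cvxpy is an assumed dependency.

```python
import numpy as np
import cvxpy as cp

rng = np.random.default_rng(6)
n, r, p = 30, 2, 0.05                      # size, rank of L, fraction of corruptions

L0 = rng.standard_normal((n, r)) @ rng.standard_normal((r, n))   # low-rank part
S0 = (rng.random((n, n)) < p) * rng.standard_normal((n, n)) * 5  # sparse part
Y = L0 + S0

# min ||L||_* + lam * ||S||_1  s.t.  L + S = Y
L, S = cp.Variable((n, n)), cp.Variable((n, n))
lam = 1 / np.sqrt(n)
cp.Problem(cp.Minimize(cp.normNuc(L) + lam * cp.sum(cp.abs(S))),
           [L + S == Y]).solve()

rel_err = np.linalg.norm(L.value - L0) / np.linalg.norm(L0)
print("relative error in L:", rel_err)     # small when the model assumptions hold
```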
