Mathematics of Sparsity (and a Few Other Things)
Emmanuel Candès


slide-1
SLIDE 1

Mathematics of Sparsity (and a Few Other Things)

Emmanuel Candès, International Congress of Mathematicians (ICM 2014), Seoul, August 2014

slide-2
SLIDE 2

Some Motivation

slide-3
SLIDE 3

Magnetic Resonance Imaging (MRI)

[Figure: MR scanner and MR image. Image from K. Pauly, G. Gold, RAD220]

slide-4
SLIDE 4

What an MRI machine sees

y(k1, k2) = ∬ f(x1, x2) e^{−i2π(k1 x1 + k2 x2)} dx1 dx2

slide-5
SLIDE 5

How do we form an image?

f(x1, x2) ≈ Σ_{k1} Σ_{k2} y(k1, k2) e^{i2π(k1 x1 + k2 x2)}

slide-6
SLIDE 6

A surprising experiment

[Figure: Fourier transform, highly subsampled]

C., Romberg and Tao (’04)

slide-7
SLIDE 7

A surprising experiment

[Figure: Fourier transform, highly subsampled]

C., Romberg and Tao (’04)

slide-8
SLIDE 8

A surprising experiment

[Figure: Fourier transform, highly subsampled]

C., Romberg and Tao (’04)

slide-9
SLIDE 9

A surprising experiment

[Figure: Fourier transform, highly subsampled; classical reconstruction]

C., Romberg and Tao (’04)

slide-10
SLIDE 10

A surprising experiment

[Figure: Fourier transform, highly subsampled; classical reconstruction; compressed sensing reconstruction]

C., Romberg and Tao (’04)

slide-11
SLIDE 11

A surprising experiment

[Figure: Fourier transform, highly subsampled; classical reconstruction; compressed sensing reconstruction]

Algorithm: min Σ_{x1,x2} ‖∇f(x1, x2)‖ subject to data constraints

C., Romberg and Tao (’04)

slide-12
SLIDE 12

Other data recovery problems: collaborative filtering

Netflix Challenge: predict unseen ratings

slide-13
SLIDE 13

Another surprising experiment

Ground truth: 50 × 50 matrix

slide-14
SLIDE 14

Another surprising experiment

Observed samples

slide-15
SLIDE 15

Another surprising experiment

Observed samples; estimate via nuclear-norm minimization

slide-16
SLIDE 16

Another surprising experiment

Ground truth; estimate via nuclear-norm minimization

slide-17
SLIDE 17

Common theme

Underdetermined system of linear equations about x ∈ R^n or C^n:

y_k = ⟨a_k, x⟩, k = 1, …, m, with m ≪ n

Convex programming returns the correct solution
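As a numerical aside (an editorial sketch, not from the talk; dimensions and seed are arbitrary choices), a few lines of NumPy show why such a system cannot be inverted naively: infinitely many vectors satisfy the equations, and the natural least-squares answer is not the structured one.

```python
import numpy as np

# A toy underdetermined system y = Ax: m = 10 equations, n = 40 unknowns.
# The ground truth x is 3-sparse, so it has far fewer degrees of freedom than n.
rng = np.random.default_rng(0)
m, n, s = 10, 40, 3
A = rng.standard_normal((m, n))
x_true = np.zeros(n)
x_true[rng.choice(n, s, replace=False)] = rng.standard_normal(s)
y = A @ x_true

# The minimum-ell2-norm solution (pseudoinverse) also satisfies y = Ax ...
x_l2 = np.linalg.pinv(A) @ y
assert np.allclose(A @ x_l2, y)

# ... but it is dense: the equations alone do not single out the sparse x.
print(int(np.sum(np.abs(x_l2) > 1e-8)))   # far more than s nonzero entries
```

The point of the next slides is that the right convex program, under the right conditions, does single out the structured solution.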

slide-18
SLIDE 18

What’s Behind This Phenomenon?

slide-19
SLIDE 19

Ingredients for success

(1) Structured solutions (2) Recovery via convex programming (3) Incoherence

slide-20
SLIDE 20

Structured solutions

How can we possibly solve? Need some structure

slide-21
SLIDE 21

Structured solutions

How can we possibly solve? Need some structure

Sparsity

An s-sparse vector x ∈ C^n has at most s nonzero entries → at most s degrees of freedom (df)

slide-22
SLIDE 22

Structured solutions

How can we possibly solve? Need some structure

Sparsity

An s-sparse vector x ∈ C^n has at most s nonzero entries → at most s degrees of freedom (df)

Low rank

A rank-r matrix X ∈ R^{n1×n2} → r(n1 + n2 − r) degrees of freedom

slide-23
SLIDE 23

Structured solutions

How can we possibly solve? Need some structure

Sparsity

An s-sparse vector x ∈ C^n has at most s nonzero entries → at most s degrees of freedom (df)

Low rank

A rank-r matrix X ∈ R^{n1×n2} → r(n1 + n2 − r) degrees of freedom

If df < # unknowns, can we solve?

slide-24
SLIDE 24

First impulse for finding structured solutions

Find the sparsest solution: minimize |{i : x_i ≠ 0}| subject to y = Ax

Find the lowest-rank solution: minimize rank(X) subject to y = A(X)

slide-25
SLIDE 25

First impulse for finding structured solutions

Find the sparsest solution: minimize |{i : x_i ≠ 0}| subject to y = Ax

Find the lowest-rank solution: minimize rank(X) subject to y = A(X)

NP-hard: the best known algorithms take time at least exponential in the problem size

slide-26
SLIDE 26

Recovery by convex programming

minimize ‖x‖ subject to y = Ax

slide-27
SLIDE 27

Recovery by convex programming

minimize ‖x‖ subject to y = Ax

ℓ1 norm for the sparse recovery problem: ‖x‖_{ℓ1} = Σ_i |x_i|

Nuclear (Schatten-1) norm for the low-rank recovery problem: ‖X‖_{S1} = Σ_i σ_i(X), where σ_i(X) = √λ_i(X*X)

slide-28
SLIDE 28

Recovery by convex programming

minimize ‖x‖ subject to y = Ax

ℓ1 norm for the sparse recovery problem: ‖x‖_{ℓ1} = Σ_i |x_i|

Nuclear (Schatten-1) norm for the low-rank recovery problem: ‖X‖_{S1} = Σ_i σ_i(X), where σ_i(X) = √λ_i(X*X)

The min-norm problem is a convex program and is computationally tractable

slide-29
SLIDE 29

Incoherence – sparse recovery

[Figure: y = A x, with A a wide matrix and x a vector with a few nonzero entries (∗)]

slide-30
SLIDE 30

Incoherence – sparse recovery

[Figure: y = A x, with A a wide matrix and x a vector with a few nonzero entries (∗)]

slide-31
SLIDE 31

Incoherence – sparse recovery

[Figure: y = A x, with A a wide matrix and x a vector with a few nonzero entries (∗)]

Rows of A (sampling vectors a_k) cannot be sparse

slide-32
SLIDE 32

Incoherence – sparse recovery

[Figure: y = A x, with A a wide matrix and x a vector with a few nonzero entries (∗)]

Rows of A (sampling vectors a_k) cannot be sparse: the solution x is local, the rows of A are global

slide-33
SLIDE 33

Formal definition

Stochastic description: y_k = ⟨a_k, x⟩ with a_1, a_2, …, a_m i.i.d. ∼ F

slide-34
SLIDE 34

Formal definition

Stochastic description: y_k = ⟨a_k, x⟩ with a_1, a_2, …, a_m i.i.d. ∼ F

(1) Rows are diverse: E[aa*] = Σ is not singular (we take Σ = I, so that E‖a‖²_{ℓ2} = n)

slide-35
SLIDE 35

Formal definition

Stochastic description: y_k = ⟨a_k, x⟩ with a_1, a_2, …, a_m i.i.d. ∼ F

(1) Rows are diverse: E[aa*] = Σ is not singular (we take Σ = I, so that E‖a‖²_{ℓ2} = n)

(2) Rows are not sparse: the incoherence parameter μ(F) is small, where max_i |⟨a, e_i⟩|² ≤ μ(F) for a ∼ F and e_i the standard basis

slide-36
SLIDE 36

Examples

max_i |⟨a, e_i⟩|² ≤ μ(F) for a ∼ F; always μ(F) ≥ 1

slide-37
SLIDE 37

Examples

max_i |⟨a, e_i⟩|² ≤ μ(F) for a ∼ F; always μ(F) ≥ 1

Random frequency sampling is incoherent: μ(F) = 1

slide-38
SLIDE 38

Examples

max_i |⟨a, e_i⟩|² ≤ μ(F) for a ∼ F; always μ(F) ≥ 1

Random frequency sampling is incoherent: μ(F) = 1

Random ‘time’ sampling is coherent: μ(F) = n
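The two extremes are easy to check numerically (an editorial sketch; n = 64 and the normalization E‖a‖² = n are choices made here to match the slides):

```python
import numpy as np

n = 64
rng = np.random.default_rng(1)

# Random frequency sampling: a DFT row has entries of unit modulus,
# so max_i |<a, e_i>|^2 = 1 for every row: maximally incoherent, mu(F) = 1.
k = rng.integers(n)
a_freq = np.exp(2j * np.pi * k * np.arange(n) / n)
mu_freq = np.max(np.abs(a_freq) ** 2)

# Random 'time' sampling: a scaled standard-basis vector sqrt(n) * e_j
# (scaled so that E ||a||^2 = n) puts all its mass on one coordinate: mu(F) = n.
j = rng.integers(n)
a_time = np.sqrt(n) * np.eye(n)[j]
mu_time = np.max(np.abs(a_time) ** 2)

print(mu_freq, mu_time)   # 1 and n: the incoherent and coherent extremes
```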

slide-39
SLIDE 39

Sparse signal recovery

minimize ‖x‖_{ℓ1} subject to y = Ax

Theorem (C. and Plan, ’10)

x⋆ ∈ C^n is an arbitrary s-sparse vector. The data vector is y = Ax⋆ ∈ C^m with m ≳ μ(F) · df · log n, df = s. Then with high probability, the min-ℓ1 solution is unique and exact!

Rows diverse and not sparse ⇒ ≳ s log n samples suffice

slide-40
SLIDE 40

Sparse signal recovery

minimize ‖x‖_{ℓ1} subject to y = Ax

Theorem (C. and Plan, ’10)

x⋆ ∈ C^n is an arbitrary s-sparse vector. The data vector is y = Ax⋆ ∈ C^m with m ≳ μ(F) · df · log n, df = s. Then with high probability, the min-ℓ1 solution is unique and exact!

Rows diverse and not sparse ⇒ ≳ s log n samples suffice

Tight (no method can succeed with substantially fewer samples)

C., Romberg and Tao (’04): exact recovery from ≳ s log n randomly selected Fourier samples

slide-41
SLIDE 41

Incoherence – low-rank recovery

rank-2 matrix

slide-42
SLIDE 42

Incoherence – low-rank recovery

rank-2 matrix with missing data

slide-43
SLIDE 43

Incoherence – low-rank recovery

rank-2 matrix with missing data

Cannot have sparse rows and/or columns: the row and/or column space must not be aligned with the coordinate axes

slide-44
SLIDE 44

Formal definition

X ∈ R^{n1×n2} of rank r: μ(X) ≥ 1 is the maximum correlation with the coordinate axes:

max_{1≤i≤n1} ‖π_{col(X)} e_i‖²_{ℓ2} / (r/n1) ≤ μ(X)

max_{1≤j≤n2} ‖π_{row(X)} e_j‖²_{ℓ2} / (r/n2) ≤ μ(X)

Matrices with column and row spaces drawn uniformly at random have low coherence

slide-45
SLIDE 45

Low-rank matrix completion

minimize ‖X‖_{S1} subject to y = A(X)

Theorem (C. and Recht ’08, C. and Tao ’09, Gross ’09; as stated, due to C. and Li ’13 and Chen ’13)

X⋆ is an arbitrary n1 × n2 matrix of rank r. y consists of m revealed entries at randomly selected locations, with m ≳ μ(X) · df · log²(n1 + n2), df = r(n1 + n2 − r). Then with high probability, the min nuclear-norm solution is unique and exact!

Can recover most low-rank matrices from just a few entries

slide-46
SLIDE 46

Low-rank matrix completion

minimize ‖X‖_{S1} subject to y = A(X)

Theorem (C. and Recht ’08, C. and Tao ’09, Gross ’09; as stated, due to C. and Li ’13 and Chen ’13)

X⋆ is an arbitrary n1 × n2 matrix of rank r. y consists of m revealed entries at randomly selected locations, with m ≳ μ(X) · df · log²(n1 + n2), df = r(n1 + n2 − r). Then with high probability, the min nuclear-norm solution is unique and exact!

Can recover most low-rank matrices from just a few entries

Tight (up to a log factor)

Extensions to other linear functionals (Gross ’09); low-rank matrix recovery under Gaussian maps (Recht, Parrilo, Fazel ’07)

slide-47
SLIDE 47

Why does ℓ1 work?

slide-48
SLIDE 48

Why does ℓ1 work?

slide-49
SLIDE 49

Why does ℓ1 work?

slide-50
SLIDE 50

Why does ℓ1 work?

slide-51
SLIDE 51

Why does ℓ1 work?

slide-52
SLIDE 52

Why ℓ1 may not always work

slide-53
SLIDE 53

ℓ1 and nuclear balls

[Figure: the ℓ1 ball and the nuclear ball]

slide-54
SLIDE 54

Early use of the ℓ1 norm

Rich history in applied science: Logan (’50s), Claerbout (’70s), Santosa and Symes (’80s), Donoho (’90s), Osher and Rudin (’90s), Tibshirani (’90s), and many since then

Ben Logan (1927–), mathematician and bluegrass fiddler

slide-55
SLIDE 55

A Taste of Analysis: Geometry and Probability

slide-56
SLIDE 56

Geometry

C = {h : ‖x + th‖ ≤ ‖x‖ for some t > 0} is the cone of descent. Exact recovery if C ∩ null(A) = {0}

slide-57
SLIDE 57

Geometry

C = {h : ‖x + th‖ ≤ ‖x‖ for some t > 0} is the cone of descent. Exact recovery if C ∩ null(A) = {0}

slide-58
SLIDE 58

Geometry

slide-59
SLIDE 59

Gaussian models

Entries of A i.i.d. N(0, 1) → the row vectors a_1, …, a_m are i.i.d. N(0, I). Important consequence: null(A) is uniformly distributed, so P(C ∩ null(A) = {0}) is a volume calculation

slide-60
SLIDE 60

Volume calculations: geometric functional analysis

slide-61
SLIDE 61

Volume of a cone

Polar cone: C° = {y : ⟨y, z⟩ ≤ 0 for all z ∈ C}

[Figure: descent cone C and polar cone C°]

Statistical dimension: δ(C) := E_g min_{z∈C°} ‖g − z‖²_{ℓ2} = E_g ‖π_C(g)‖²_{ℓ2}, with g ∼ N(0, I)

slide-62
SLIDE 62

Volume of a cone

Polar cone: C° = {y : ⟨y, z⟩ ≤ 0 for all z ∈ C}

[Figure: descent cone C and polar cone C°]

Statistical dimension: δ(C) := E_g min_{z∈C°} ‖g − z‖²_{ℓ2} = E_g ‖π_C(g)‖²_{ℓ2}, with g ∼ N(0, I)

slide-63
SLIDE 63

Gordon’s escape lemma

Theorem (Gordon ’88)

Convex cone K ⊂ R^n and m × n Gaussian matrix A. With probability at least 1 − e^{−t²/2},

m ≥ (√δ(K) + t)² + 1 ⇒ null(A) ∩ K = {0}

(here m = codim(null(A)))

slide-64
SLIDE 64

Gordon’s escape lemma

Theorem (Gordon ’88)

Convex cone K ⊂ R^n and m × n Gaussian matrix A. With probability at least 1 − e^{−t²/2},

m ≥ (√δ(K) + t)² + 1 ⇒ null(A) ∩ K = {0}

(here m = codim(null(A)))

Implication: exact recovery if m ≥ δ(C) (roughly) [Rudelson and Vershynin (’08)]

slide-65
SLIDE 65

Gordon’s escape lemma

Theorem (Gordon ’88)

Convex cone K ⊂ R^n and m × n Gaussian matrix A. With probability at least 1 − e^{−t²/2},

m ≥ (√δ(K) + t)² + 1 ⇒ null(A) ∩ K = {0}

(here m = codim(null(A)))

Implication: exact recovery if m ≥ δ(C) (roughly) [Rudelson and Vershynin (’08)]

Gordon’s lemma was originally stated with the Gaussian width w(K) := E_g sup_{z ∈ K ∩ S^{n−1}} ⟨g, z⟩, and δ(K) − 1 ≤ w²(K) ≤ δ(K)

slide-66
SLIDE 66

Statistical dimension of the ℓ1 descent cone

C° is the cone of the subdifferential: C° = {t u : t > 0 and u ∈ ∂‖x‖}, where u ∈ ∂‖x‖ iff ‖x + h‖ ≥ ‖x‖ + ⟨u, h⟩ for all h

[Figure: descent cone and polar cone]

slide-67
SLIDE 67

Statistical dimension of the ℓ1 descent cone

x⋆ = (∗, …, ∗ [s times], 0, …, 0 [n − s times])

u ∈ ∂‖x⋆‖_{ℓ1} ⟺ u_i = sgn(x⋆_i) for 1 ≤ i ≤ s, and |u_i| ≤ 1 for i > s

[Figure: descent cone and polar cone]

slide-68
SLIDE 68

Statistical dimension of the ℓ1 descent cone

x⋆ = (∗, …, ∗ [s times], 0, …, 0 [n − s times])

u ∈ ∂‖x⋆‖_{ℓ1} ⟺ u_i = sgn(x⋆_i) for 1 ≤ i ≤ s, and |u_i| ≤ 1 for i > s

[Figure: descent cone and polar cone]

δ(C) = E_g min_{z∈C°} ‖g − z‖²_{ℓ2} = E inf_{t≥0, u∈∂‖x⋆‖_{ℓ1}} { Σ_{i≤s} (g_i − t u_i)² + Σ_{i>s} (g_i − t u_i)² }

slide-69
SLIDE 69

Statistical dimension of the ℓ1 descent cone

x⋆ = (∗, …, ∗ [s times], 0, …, 0 [n − s times])

u ∈ ∂‖x⋆‖_{ℓ1} ⟺ u_i = sgn(x⋆_i) for 1 ≤ i ≤ s, and |u_i| ≤ 1 for i > s

[Figure: descent cone and polar cone]

δ(C) = E inf_{t≥0} { Σ_{i≤s} (g_i ± t)² + Σ_{i>s} (|g_i| − t)₊² }

slide-70
SLIDE 70

Statistical dimension of the ℓ1 descent cone

x⋆ = (∗, …, ∗ [s times], 0, …, 0 [n − s times])

u ∈ ∂‖x⋆‖_{ℓ1} ⟺ u_i = sgn(x⋆_i) for 1 ≤ i ≤ s, and |u_i| ≤ 1 for i > s

[Figure: descent cone and polar cone]

δ(C) ≤ inf_{t≥0} { s (1 + t²) + (n − s) E(|g_1| − t)₊² }

slide-71
SLIDE 71

Statistical dimension of the ℓ1 descent cone

x⋆ = (∗, …, ∗ [s times], 0, …, 0 [n − s times])

u ∈ ∂‖x⋆‖_{ℓ1} ⟺ u_i = sgn(x⋆_i) for 1 ≤ i ≤ s, and |u_i| ≤ 1 for i > s

[Figure: descent cone and polar cone]

δ(C) ≤ inf_{t≥0} { s (1 + t²) + (n − s) E(|g_1| − t)₊² } ≤ 2s log(n/s) + 2s (a sufficient number of equations)

Stojnic (’09); Chandrasekaran, Recht, Parrilo, Willsky (’12)

slide-72
SLIDE 72

Phase transitions for Gaussian maps

Theorem (Amelunxen, Lotz, McCoy and Tropp ’13)

C is the descent cone (of the norm ‖·‖) at a fixed x⋆ ∈ R^n. Then for a fixed ε ∈ (0, 1):

m ≤ δ(C) − a_ε √n ⇒ the convex program succeeds with probability ≤ ε

m ≥ δ(C) + a_ε √n ⇒ the convex program succeeds with probability ≥ 1 − ε

where a_ε = √(8 log(4/ε))

slide-73
SLIDE 73

Phase transitions for Gaussian maps

Theorem (Amelunxen, Lotz, McCoy and Tropp ’13)

C is the descent cone (of the norm ‖·‖) at a fixed x⋆ ∈ R^n. Then for a fixed ε ∈ (0, 1):

m ≤ δ(C) − a_ε √n ⇒ the convex program succeeds with probability ≤ ε

m ≥ δ(C) + a_ε √n ⇒ the convex program succeeds with probability ≥ 1 − ε

where a_ε = √(8 log(4/ε))

[Figure: empirical phase transition diagrams]

slide-74
SLIDE 74

Phase transitions for Gaussian maps

Courtesy of Amelunxen, Lotz, McCoy and Tropp

[Figure: empirical phase transition diagrams]

Asymptotic phase transition for ℓ1 recovery: Donoho (’06), Donoho and Tanner (’09)

slide-75
SLIDE 75

Discrete geometry approach (Donoho and Tanner ’06, ’09)

Cross-polytope P = {x ∈ R^n : ‖x‖_{ℓ1} ≤ 1}; projected polytope AP

[Figure: P with vertices e_1, e_2, e_3 projected by A onto the range of A]

An s-sparse x lies on an (s − 1)-dimensional face F of P; ℓ1 succeeds ⟺ the face F is conserved (AF is a face of the projected polytope)

slide-76
SLIDE 76

Discrete geometry approach (Donoho and Tanner ’06, ’09)

Cross-polytope P = {x ∈ R^n : ‖x‖_{ℓ1} ≤ 1}; projected polytope AP

[Figure: P with vertices e_1, e_2, e_3 projected by A onto the range of A]

An s-sparse x lies on an (s − 1)-dimensional face F of P; ℓ1 succeeds ⟺ the face F is conserved (AF is a face of the projected polytope)

Integral geometry of convex sets: McMullen (’75), Grünbaum (’68). Polytope angle calculations: Vershik and Sporyshev (’86, ’92), Affentranger and Schneider (’92)

slide-77
SLIDE 77

Non-Gaussian models

MRI; collaborative filtering. Under incoherence, the convex program succeeds if m (the number of equations) ≳ df · log n

slide-78
SLIDE 78

Dual certificates

min ‖x‖ s.t. y = Ax. x is a solution iff there exists v ⊥ null(A) (i.e. v ∈ row(A)) with v ∈ ∂‖x‖

[Figure: null(A), descent cone, row(A), polar cone]

slide-79
SLIDE 79

Dual certificates

min ‖x‖ s.t. y = Ax. x is a solution iff there exists v ⊥ null(A) (i.e. v ∈ row(A)) with v ∈ ∂‖x‖

[Figure: null(A), descent cone, row(A), polar cone]

slide-80
SLIDE 80

Sparse recovery

Dual certificate: v ∈ row(A) = span(a_1, …, a_m) with v_i = sgn(x_i) if x_i ≠ 0, and |v_i| ≤ 1 if x_i = 0

Example: Fourier sampling → a_k(t) = e^{i2πω_k t} with ω_k random, so v(t) = Σ_k c_k e^{i2πω_k t} ∈ row(A)

[Figure: v interpolating sgn(x) at the nonzeros of x and bounded between −1 and +1 elsewhere]

slide-81
SLIDE 81

Dual certificate construction

v ∈ row(A) with Pv = sgn(x) and ‖(I − P)v‖_{ℓ∞} ≤ 1, where (Pv)_i = v_i if x_i ≠ 0 and (Pv)_i = 0 if x_i = 0

slide-82
SLIDE 82

Dual certificate construction

v ∈ row(A) with Pv = sgn(x) and ‖(I − P)v‖_{ℓ∞} ≤ 1, where (Pv)_i = v_i if x_i ≠ 0 and (Pv)_i = 0 if x_i = 0

Candidate certificate

minimize ‖v‖_{ℓ2} subject to Pv = sgn(x) and v ∈ row(A), which gives v = A*A (P A*A P)^{−1} sgn(x)

slide-83
SLIDE 83

Dual certificate construction

v ∈ row(A) with Pv = sgn(x) and ‖(I − P)v‖_{ℓ∞} ≤ 1, where (Pv)_i = v_i if x_i ≠ 0 and (Pv)_i = 0 if x_i = 0

Candidate certificate

minimize ‖v‖_{ℓ2} subject to Pv = sgn(x) and v ∈ row(A), which gives v = A*A (P A*A P)^{−1} sgn(x)

[Figure: the candidate certificate interpolating sgn(x)]

slide-84
SLIDE 84

Dual certificate construction

v ∈ row(A) with Pv = sgn(x) and ‖(I − P)v‖_{ℓ∞} ≤ 1, where (Pv)_i = v_i if x_i ≠ 0 and (Pv)_i = 0 if x_i = 0

Candidate certificate

minimize ‖v‖_{ℓ2} subject to Pv = sgn(x) and v ∈ row(A), which gives v = A*A (P A*A P)^{−1} sgn(x)

Analysis via combinatorial methods: sparse signal recovery (C., Romberg and Tao ’04); matrix completion (C. and Tao ’09)

Analysis for matrix completion via tools from geometric functional analysis (C. and Recht ’08). Gives accurate answers in the Gaussian case: m ≈ 2s log n (C. and Recht ’12). Widely used since then.

slide-85
SLIDE 85

Some Immediate and (Far) Less Immediate Applications

slide-86
SLIDE 86

Impact on MR pediatrics

Lustig (UCB), Pauly, Vasanawala (Stanford)

6-year-old; 8× acceleration; 16-second scan; 0.875 mm in-plane resolution; 1.6 mm slice thickness; 32 channels

slide-87
SLIDE 87

1 year old female with liver lesions: 8X acceleration

Lustig (UCB), Pauly, Vasanawala (Stanford)

Parallel imaging (PI) Compressed sensing + PI

Lesions are barely seen with linear reconstruction

slide-88
SLIDE 88

6 year old male abdomen: 8X acceleration

Lustig (UCB), Pauly, Vasanawala (Stanford)

Parallel imaging (PI) Compressed sensing + PI

Fine structures (arrows) are buried in noise (artifacts + noise amplification) and recovered by CS (`1 + wavelets)

slide-89
SLIDE 89

6 year old male abdomen: 8X acceleration

Lustig (UCB), Pauly, Vasanawala (Stanford)

Parallel imaging (PI) Compressed sensing + PI

Fine structures (arrows) are buried in noise and recovered by CS

slide-90
SLIDE 90

Missing phase problem

Eyes and detectors see intensity, but light is a wave → it has both intensity and phase

Phase retrieval

find x ∈ C^n subject to y = |Ax|² (i.e., y_k = |⟨a_k, x⟩|², k = 1, …, m)

slide-91
SLIDE 91

Origin in X-ray crystallography

10 Nobel Prizes in X-ray crystallography, and counting...

slide-92
SLIDE 92

Another look at phase retrieval

With Eldar, Strohmer and Voroninski. Find x subject to |⟨a_k, x⟩|² = y_k, k = 1, …, m. Solving quadratic equations is NP-hard in general → ad hoc solutions

slide-93
SLIDE 93

Another look at phase retrieval

With Eldar, Strohmer and Voroninski. Find x subject to |⟨a_k, x⟩|² = y_k, k = 1, …, m. Solving quadratic equations is NP-hard in general → ad hoc solutions

Lifting: set X = xx*, so that |⟨a_k, x⟩|² = Tr(a_k a_k* xx*) := Tr(a_k a_k* X)

Phase retrieval problem

find X such that A(X) = y, X ⪰ 0, rank(X) = 1

slide-94
SLIDE 94

Another look at phase retrieval

With Eldar, Strohmer and Voroninski. Find x subject to |⟨a_k, x⟩|² = y_k, k = 1, …, m. Solving quadratic equations is NP-hard in general → ad hoc solutions

Lifting: set X = xx*, so that |⟨a_k, x⟩|² = Tr(a_k a_k* xx*) := Tr(a_k a_k* X)

Phase retrieval problem

find X such that A(X) = y, X ⪰ 0, rank(X) = 1

PhaseLift

minimize Tr(X) subject to A(X) = y, X ⪰ 0

slide-95
SLIDE 95

Another look at phase retrieval

With Eldar, Strohmer and Voroninski. Find x subject to |⟨a_k, x⟩|² = y_k, k = 1, …, m. Solving quadratic equations is NP-hard in general → ad hoc solutions

Lifting: set X = xx*, so that |⟨a_k, x⟩|² = Tr(a_k a_k* xx*) := Tr(a_k a_k* X)

Phase retrieval problem

find X such that A(X) = y, X ⪰ 0, rank(X) = 1

PhaseLift

minimize Tr(X) subject to A(X) = y, X ⪰ 0

Other convex relaxations of quadratically constrained QPs: Shor (’87); Goemans and Williamson (’95) [MAX-CUT]

slide-96
SLIDE 96

A surprise

Phase retrieval: find x s.t. y_k = |⟨a_k, x⟩|²

PhaseLift: min Tr(X) s.t. A(X) = y, X ⪰ 0
slide-97
SLIDE 97

A surprise

Phase retrieval: find x s.t. y_k = |⟨a_k, x⟩|²

PhaseLift: min Tr(X) s.t. A(X) = y, X ⪰ 0

Theorem (C. and Li (’12); C., Strohmer and Voroninski (’11))

The a_k are independently and uniformly sampled on the unit sphere, with m ≳ n. Then with probability 1 − O(e^{−m}), the only feasible point is xx*:

{X : A(X) = y and X ⪰ 0} = {xx*}!

Proof via construction of dual certificates

slide-98
SLIDE 98

A separation problem

Candès, Li, Wright, Ma (’09); Chandrasekaran, Sanghavi, Parrilo, Willsky (’09)

Y = L + S, where Y is the data matrix (observed), L is low-rank (unobserved), and S is sparse (unobserved)

[Figure: a matrix whose entries (×) mix a low-rank and a sparse component]

slide-99
SLIDE 99

A separation problem

Candès, Li, Wright, Ma (’09); Chandrasekaran, Sanghavi, Parrilo, Willsky (’09)

Y = L + S, where Y is the data matrix (observed), L is low-rank (unobserved), and S is sparse (unobserved)

[Figure: a matrix whose entries (×) mix a low-rank and a sparse component]

slide-100
SLIDE 100

A separation problem

Candès, Li, Wright, Ma (’09); Chandrasekaran, Sanghavi, Parrilo, Willsky (’09)

Y = L + S, where Y is the data matrix (observed), L is low-rank (unobserved), and S is sparse (unobserved)

[Figure: a matrix whose entries (×) mix a low-rank and a sparse component]

Can we recover L and S accurately? It looks impossible. Recovering low-dimensional structure from corrupted data: an approach to robust principal component analysis (PCA)

slide-101
SLIDE 101

De-mixing by convex programming

Y = L + S, with L unknown (rank unknown) and S unknown (number of nonzero entries unknown)

slide-102
SLIDE 102

De-mixing by convex programming

Y = L + S, with L unknown (rank unknown) and S unknown (number of nonzero entries unknown)

Recovery via convex programming

minimize ‖L̂‖_{S1} + λ‖Ŝ‖_{ℓ1} subject to L̂ + Ŝ = Y, where ‖S‖_{ℓ1} = Σ_{ij} |S_ij|

See also Chandrasekaran, Sanghavi, Parrilo, Willsky (’09)

slide-103
SLIDE 103

A last surprise

minimize ‖L̂‖_{S1} + λ‖Ŝ‖_{ℓ1} subject to L̂ + Ŝ = Y

Theorem (C., Li, Wright and Ma (’09))

L is n × n with rank(L) ≤ ρ_r n / (log n)² and incoherent. S is n × n with a random sparsity pattern of cardinality at most ρ_s n². Then with probability 1 − O(n^{−10}), recovery with λ = 1/√n is exact: L̂ = L, Ŝ = S. The same conclusion holds for rectangular matrices with λ = 1/√(max dim).

slide-104
SLIDE 104

A last surprise

minimize ‖L̂‖_{S1} + λ‖Ŝ‖_{ℓ1} subject to L̂ + Ŝ = Y

Theorem (C., Li, Wright and Ma (’09))

L is n × n with rank(L) ≤ ρ_r n / (log n)² and incoherent. S is n × n with a random sparsity pattern of cardinality at most ρ_s n². Then with probability 1 − O(n^{−10}), recovery with λ = 1/√n is exact: L̂ = L, Ŝ = S. The same conclusion holds for rectangular matrices with λ = 1/√(max dim).

No tuning parameter! Whatever the magnitudes of L and S. Proof via dual certificates!

slide-105
SLIDE 105

[Figure: data Y and its low-rank component L]

slide-106
SLIDE 106

[Figure: video as a matrix; axes are time and space (pixels), with the i-th frame marked]

slide-107
SLIDE 107

L + S background subtraction

slide-108
SLIDE 108

L + S background subtraction

From GoDec

slide-109
SLIDE 109

L + S reconstruction of MR angiography

[Figure: L + S, L, and S components]

Automatic and improved background suppression

Joint with R. Otazo and D. Sodickson

slide-110
SLIDE 110

Free-breathing MRI of the liver

[Figure: NUFFT, standard L + S, and motion-guided L + S reconstructions]; 12.8-fold acceleration

min ‖L‖_{S1} + λ‖S‖_{ℓ1} s.t. A(L + S) = y

Joint with R. Otazo and D. Sodickson

slide-111
SLIDE 111

Free-breathing MRI of the liver

[Figure: NUFFT, standard L + S, and motion-guided L + S reconstructions; temporal blurring]

Joint with R. Otazo and D. Sodickson

slide-112
SLIDE 112

Free-breathing MRI of the kidneys

[Figure: NUFFT, standard L + S, and motion-guided L + S reconstructions]; 12.8-fold acceleration

min ‖L‖_{S1} + λ‖S‖_{ℓ1} s.t. A(L + S) = y

Joint with R. Otazo and D. Sodickson

slide-113
SLIDE 113

Free-breathing MRI of the kidneys

[Figure: NUFFT, standard L + S, and motion-guided L + S reconstructions]

Joint with R. Otazo and D. Sodickson

slide-114
SLIDE 114

Things I have not talked about

Deterministic results; approximate sparsity/low rank; noise (inexact measurements); computational issues; a host of applications; ...

with R. Yang

slide-115
SLIDE 115


slide-116
SLIDE 116

Thanks!