Sparsity-aware sampling theorems and applications Rachel Ward - - PowerPoint PPT Presentation

slide-1
SLIDE 1

Sparsity-aware sampling theorems and applications

Rachel Ward
University of Texas at Austin
November 2014

slide-2
SLIDE 2

Sparsity-aware sampling: motivating example

Problem: N = 100,000 soldiers should be screened for syphilis. Syphilis is rare (only about s = 10 cases expected out of 100,000), and a blood test is expensive. Do we need to take N blood tests?

slide-3
SLIDE 3

Sparsity-aware sampling: motivating example

Idea: Pool blood together. Test a combined blood sample to check if at least one soldier has syphilis.

slide-4
SLIDE 4

Sparsity-aware sampling: motivating example

We only need to take about s log N ≪ N blood tests to identify the infected soldiers (“compressed” measurements).

Implemented by the U.S. Government during WWII.
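To make the pooling idea concrete, here is a minimal sketch of one simple pooling scheme: adaptive halving, where a positive pool is split in two and each half is tested again. This is an illustration chosen for brevity and is not claimed to be the scheme from the slide or the one used historically; the function name `find_infected` and the random seed are assumptions. The number of pooled tests it uses is on the order of s log N rather than N.

```python
import random

def find_infected(infected, lo, hi):
    """Return (sorted infected indices in [lo, hi), number of pooled tests used)."""
    # One pooled test: is anyone in the pool [lo, hi) infected?
    positive = any(lo <= i < hi for i in infected)
    if not positive:
        return [], 1          # a single negative test clears the whole pool
    if hi - lo == 1:
        return [lo], 1        # a positive pool of size one identifies a soldier
    mid = (lo + hi) // 2      # otherwise split the pool and test both halves
    left, tests_left = find_infected(infected, lo, mid)
    right, tests_right = find_infected(infected, mid, hi)
    return left + right, 1 + tests_left + tests_right

# Illustrative numbers from the slide: N = 100,000 soldiers, s = 10 infected.
rng = random.Random(0)        # fixed seed (assumption) for reproducibility
N, s = 100_000, 10
infected = set(rng.sample(range(N), s))
found, tests = find_infected(infected, 0, N)
print(f"identified {len(found)} infected soldiers with {tests} pooled tests")
```

Each infected soldier lies in at most 17 pools of size greater than one, so this sketch never needs more than a few hundred tests for these values of N and s.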

slide-5
SLIDE 5

Compressive sensing

Main idea: Many natural signals and images of interest are sparse in some sense. We say x is s-sparse if $\|x\|_0 = \#\{j : |x_j| > 0\} \le s$.

Theory: from only $m \approx s \log(N)$ incoherent linear measurements $y = \Phi x$, one can recover a sparse signal, e.g. as the vector of minimal $\ell_1$-norm satisfying $y = \Phi x$.

slide-6
SLIDE 6

Examples of sparsity:

  • Natural images
  • Smooth function interpolation
  • Low-rank matrices

slide-7
SLIDE 7

Incoherent sampling

y = Ax

Let (Φ, Ψ) be a pair of orthonormal bases of $\mathbb{R}^N$.

  1. $\Phi = (\varphi_j)$ is used for sensing: $A \in \mathbb{R}^{m \times N}$ is a subset of m rows of Φ.
  2. $\Psi = (\psi_k)$ is used to sparsely represent x: $x = \Psi^* b$, and b is assumed sparse.

Definition

The coherence between Φ and Ψ is

$$\mu(\Phi, \Psi) = \sqrt{N} \max_{1 \le k, j \le N} |\langle \varphi_j, \psi_k \rangle|.$$

slide-8
SLIDE 8

Incoherent sampling

If µ(Φ, Ψ) = C for a constant C independent of N, then Φ and Ψ are called incoherent.

slide-9
SLIDE 9

Incoherent sampling

Example:

  • Ψ = Identity: the signal is sparse in the canonical (Kronecker) basis.
  • Φ is the discrete Fourier basis, $\varphi_j = \frac{1}{\sqrt{N}} \left( e^{i 2\pi jk/N} \right)_{k=0}^{N-1}$.
  • The Kronecker and Fourier bases are incoherent:

$$\mu(\Phi, \Psi) := \sqrt{N} \max_{j,k} |\langle \varphi_j, \psi_k \rangle| = 1.$$
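The coherence of this example can be checked numerically. The sketch below uses only the Python standard library; the helper names `dft_basis`, `identity_basis` and `coherence` are illustrative assumptions, and N = 8 is an arbitrary small size.

```python
import cmath
import math

def dft_basis(n):
    """Rows are the discrete Fourier vectors phi_j = n^{-1/2} (e^{2 pi i jk/n})_k."""
    return [[cmath.exp(2j * math.pi * j * k / n) / math.sqrt(n) for k in range(n)]
            for j in range(n)]

def identity_basis(n):
    """Rows are the canonical (Kronecker) basis vectors."""
    return [[1.0 if j == k else 0.0 for k in range(n)] for j in range(n)]

def coherence(phi, psi):
    """sqrt(N) times the largest absolute inner product between the two bases."""
    n = len(phi)

    def inner(u, v):
        return sum(a * complex(b).conjugate() for a, b in zip(u, v))

    return math.sqrt(n) * max(abs(inner(p, q)) for p in phi for q in psi)

n = 8
mu_fourier = coherence(dft_basis(n), identity_basis(n))   # incoherent pair
mu_same = coherence(identity_basis(n), identity_basis(n)) # maximally coherent pair
print(mu_fourier, mu_same)
```

Up to floating-point rounding, the Fourier/Kronecker pair attains the smallest possible value 1, while pairing a basis with itself gives the largest possible value, the square root of N.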

slide-10
SLIDE 10

Why does ℓ1 minimization work?

slide-17
SLIDE 17

Reconstructing sparse signals

ℓ1-minimization

$$x^\# = \arg\min_{z \in \mathbb{R}^N} \sum_{j=1}^{N} |z_j| \quad \text{such that } Az = Ax.$$

Or, if x is sparse with respect to basis Ψ,

$$x^\# = \arg\min_{z \in \mathbb{R}^N} \sum_{j=1}^{N} |(\Psi^* z)_j| \quad \text{such that } Az = Ax.$$
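One reason ℓ1-minimization is practical is that, for real-valued z, the first program above can be rewritten as a linear program. The standard reformulation is sketched here; the auxiliary vector t is notation introduced for this illustration and does not appear in the slides.

```latex
\min_{z,\, t \in \mathbb{R}^N} \ \sum_{j=1}^{N} t_j
\quad \text{subject to} \quad
-t_j \le z_j \le t_j \ \ (j = 1, \dots, N),
\qquad Az = Ax .
```

At an optimum each $t_j$ equals $|z_j|$, so the optimal value coincides with the minimal ℓ1-norm, and any general-purpose linear programming solver can be applied.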

slide-18
SLIDE 18

Theorem (Sparse recovery via incoherent sampling¹)

Let (Φ, Ψ) be a pair of incoherent orthonormal bases of $\mathbb{R}^N$. Select m (possibly not distinct) rows of Φ i.i.d. uniformly to form $A : \mathbb{R}^N \to \mathbb{R}^m$, where $m \ge C s \log(N)$. With exceedingly high probability, the following holds: for all $x \in \mathbb{R}^N$ such that $\Psi^* x$ is s-sparse,

$$x = \arg\min_{z \in \mathbb{R}^N} \sum_{j=1}^{N} |(\Psi^* z)_j| \quad \text{such that } Az = Ax.$$

Such reconstruction is also stable to sparsity defects and robust to noise.

¹ Candès, Romberg, Tao ’06; Rudelson, Vershynin ’08; ...

slide-19
SLIDE 19

Theory is largely restricted to incoherent measurement/sparsity bases, finite-dimensional spaces, and sparsity in orthonormal representations; this is not sufficient for key examples.

Current research directions:

  1. Importance sampling for compressive sensing applications
  2. Adaptive sampling strategies
  3. Extend theory from sparsity in orthonormal bases to sparsity in redundant dictionaries
  4. Extend theory from finite-dimensional spaces to infinite-dimensional spaces

slide-20
SLIDE 20

Compressive imaging

In MRI, one cannot observe the N = n × n pixel image directly; one can only take samples from the 2D (or 3D) discrete Fourier transform F.

So we can acquire a number m ≪ N of linear measurements of the form

$$y_{k_1,k_2} = (Fx)_{k_1,k_2} = \frac{1}{n} \sum_{j_1,j_2} x_{j_1,j_2}\, e^{2\pi i (k_1 j_1 + k_2 j_2)/n}, \qquad -n/2 + 1 \le k_1, k_2 \le n/2.$$

Smaller m means a faster MRI scan! How should we subsample in the frequency domain?

slide-21
SLIDE 21

In the MRI setting ... random sampling fails

Reconstructions of an image from m = 0.1N frequency measurements using total variation minimization.

[Figure: sampling patterns shown in pixel space and frequency space. Panels: reconstruction from lowest frequencies; reconstruction from uniformly subsampled frequencies.]

slide-22
SLIDE 22

In the MRI setting ... random sampling fails

Natural images are sparsely represented in 2D wavelet bases Ψ.

Possible sensing measurements are Fourier measurements Φ.

Uniform random sampling fails because the wavelet and Fourier bases are not incoherent.

slide-23
SLIDE 23

Importance sampling

[Figure: sampling patterns in the image domain and Fourier domain. Panels: (a) full sampling; (b) uniform random; (c) radial line sampling; (d) variable-density.]

Used in MRI: radial-line sampling.

New: “importance sampling”: take random samples according to an inverse-square-distance variable density. Draw frequency $(k_1, k_2)$ with probability

$$\propto \frac{1}{k_1^2 + k_2^2}.$$

With variable-density sampling, one can extend compressed sensing results and prove that $m \gtrsim s \log(N)$ 2D DFT measurements suffice for recovering images with s-sparse wavelet expansions.
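A minimal sketch of drawing frequencies from such an inverse-square-distance density is given below, using only the Python standard library. Two choices are assumptions made for this illustration and are not taken from the slides: the weight at the zero frequency, where 1/(k1² + k2²) is undefined, is capped at 1, and the grid size n = 64 and sample count m = 2000 are arbitrary.

```python
import random

def variable_density_weights(n):
    """Frequencies -n/2+1 <= k1, k2 <= n/2 with weights 1 / max(1, k1^2 + k2^2)."""
    ks = range(-n // 2 + 1, n // 2 + 1)
    freqs = [(k1, k2) for k1 in ks for k2 in ks]
    # Cap at 1 so the zero frequency gets a finite weight (assumption).
    weights = [1.0 / max(1, k1 * k1 + k2 * k2) for k1, k2 in freqs]
    return freqs, weights

n, m = 64, 2000
freqs, weights = variable_density_weights(n)
rng = random.Random(0)  # fixed seed (assumption) for reproducibility
# i.i.d. draws with replacement, with probability proportional to the weights
samples = rng.choices(freqs, weights=weights, k=m)

# Fraction of draws that land within distance 8 of the zero frequency.
low_fraction = sum(1 for k1, k2 in samples if k1 * k1 + k2 * k2 <= 64) / m
print(f"{low_fraction:.2f} of the samples are low-frequency")
```

Under uniform sampling only about 5% of draws would fall in that low-frequency disk, so the variable density concentrates measurements where the Fourier and wavelet bases are most coherent.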

slide-25
SLIDE 25

High-dimensional function interpolation

Given a function f : D → C on a d-dimensional domain D, reconstruct or interpolate f from sample values $f(t_1), \dots, f(t_m)$.

[Figure: three panels: original function; least squares solution; unweighted ℓ1 minimizer.]

Assume the form $f(t) = \sum_{j \in \Gamma} x_j \psi_j(t)$, where x has assumed structure:

  1. Sparsity: $\|x\|_0 := \#\{\ell : x_\ell \ne 0\} \le s$
  2. Smoothness: coefficient decay $\sum_j j^r |x_j| < \infty$.

slide-26
SLIDE 26

High-dimensional function interpolation

The smoothness assumption alone is not strong enough to overcome the curse of dimensionality: one needs $m \approx (1/\varepsilon)^{d/r}$ sample values for accuracy ε.
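To see how quickly this sample count grows with the dimension, consider a worked example. The values ε = 0.1 and r = 2 are chosen purely for illustration and do not come from the slides.

```latex
\left(\tfrac{1}{\varepsilon}\right)^{d/r}
= 10^{d/2}
\quad\Longrightarrow\quad
d = 2:\ 10, \qquad
d = 10:\ 10^{5}, \qquad
d = 20:\ 10^{10}.
```

Each additional pair of dimensions multiplies the required number of samples by ten, which is why smoothness alone does not scale to high-dimensional problems.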

slide-27
SLIDE 27

High-dimensional function interpolation

Our work: combine smoothness and sparsity for weighted ℓ1-coefficient function spaces. $m \approx (1/\varepsilon)\, s \log^3(s)$ samples are sufficient to reconstruct such a function, independent of the dimension d.

slide-28
SLIDE 28

Low-rank matrix completion / approximation

Previous results: a rank-r incoherent n × n matrix M may be completed (via convex optimization) from $m \approx n r \log^2(n)$ uniformly sampled entries.

Our results: an arbitrary rank-r matrix M may be completed (via convex optimization) from $m \approx n r \log(n)$ entries, sampled according to a specific non-uniform distribution adapted to the matrix leverage scores.

Also: extensions to only approximately low-rank matrices, and two-stage adaptive sampling.
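As a rough illustration of sampling adapted to leverage scores, the sketch below treats a rank-1 matrix M = u vᵀ, for which the singular vectors are just u and v normalized. It assumes the common normalization (n/r)·‖Uᵀe_i‖² for leverage scores and takes the probability of sampling entry (i, j) proportional to the sum of its row and column scores. The slides do not state the exact distribution, so this form and the helper names are assumptions.

```python
def leverage_scores(u):
    """Normalized leverage scores (n/r) * ||U^T e_i||^2 for a rank-1 factor u (r = 1)."""
    n = len(u)
    norm_sq = sum(a * a for a in u)
    return [n * a * a / norm_sq for a in u]

def sampling_probabilities(u, v):
    """Probability of sampling entry (i, j), proportional to mu_i + nu_j (assumption)."""
    mu = leverage_scores(u)   # row leverage scores
    nu = leverage_scores(v)   # column leverage scores
    total = sum(m_i + n_j for m_i in mu for n_j in nu)
    return [[(m_i + n_j) / total for n_j in nu] for m_i in mu]

# A "spiky" rank-1 matrix: all of its mass sits in the first row.
u = [1.0, 0.0, 0.0, 0.0]
v = [1.0, 1.0, 1.0, 1.0]
p = sampling_probabilities(u, v)
print(p[0][0], p[1][0])
```

Entries in the first row are sampled five times more often than the others here. Uniform sampling would frequently miss that row, which is the kind of coherent matrix that an incoherence assumption rules out.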

slide-29
SLIDE 29

Summary

Compressed sensing and related optimization problems often assume incoherence between the sensing and sparsity bases to derive sparse recovery guarantees.

Incoherence is restrictive and not achievable in many problems of practical interest.

With small local coherence from one basis to another, one may derive sampling strategies and sparse recovery results for a wide range of new sensing problems (imaging, matrix completion, ...).

Also: weighted sparsity, measurement error, adaptive sampling, ...