Adaptive Sparse Recovery with Limited Adaptivity

Akshay Kamath, Eric Price

UT Austin

2018-11-27



Outline

1. Introduction
2. Analysis for k = 1
3. General k: lower bound
4. General k: upper bound


Sparsity

An n-dimensional vector x is “k-sparse” if it has at most k non-zero coefficients.
“Approximate sparsity”: the vector is “close” to a sparse vector.
Approximate sparsity is a common form of structure.
Images are sparse in a wavelet basis.


Sparse Recovery / Compressive Sensing

AKA heavy hitters/frequency estimation in turnstile streams

Suppose an n-dimensional vector x is k-sparse in a known basis. Observe Ax, a set of m ≪ n linear products. Why linear? Many applications:

◮ Genetic testing: mixing blood samples.
◮ Streaming updates: A(x + ∆) = Ax + A∆ (see the sketch after this slide).
◮ Camera optics: filter in front of the lens.

Goal is to robustly recover x from Ax.

◮ Informally: get close to x if x is close to k-sparse.

Extremely well studied: thousands of papers.

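The streaming-updates bullet above is just linearity of matrix multiplication. A minimal sketch of maintaining a sketch under updates (the dense Gaussian A here is an arbitrary stand-in, not one of the structured matrices used in this literature):

```python
import numpy as np

rng = np.random.default_rng(0)
n, m, k = 1000, 40, 5

# A k-sparse signal x and a turnstile-style update Delta.
x = np.zeros(n)
x[rng.choice(n, size=k, replace=False)] = rng.standard_normal(k)
delta = np.zeros(n)
delta[rng.integers(n)] += 3.0

# Any fixed measurement matrix A gives a linear sketch y = Ax.
A = rng.standard_normal((m, n)) / np.sqrt(m)
y = A @ x

# Streaming update: maintain the sketch without storing x itself,
# because A(x + Delta) = Ax + A Delta.
y = y + A @ delta
assert np.allclose(y, A @ (x + delta))
print("sketch length m =", m, "versus ambient dimension n =", n)
```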

Standard Sparse Recovery Framework

Specify a distribution on m × n matrices A (independent of x).
◮ Choose matrix A_i based on previous observations (possibly randomized).
◮ Observe A_i x.
◮ Number of measurements m is the total number of rows in all the A_i.
◮ Number of rounds is R.

Given the linear sketch Ax, recover x̂ satisfying the recovery guarantee
  ‖x̂ − x‖₂ ≤ C · min_{k-sparse x_k} ‖x − x_k‖₂
with probability 2/3.

Solvable with Θ(k log(n/k)) measurements [Candès-Romberg-Tao '06].
Solvable in O(k log log(n/k)) [Indyk-Price-Woodruff '11].

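A schematic of the round structure described above, not the algorithm from the paper: in each of R rounds the algorithm picks A_i from everything observed so far, and m counts the total number of rows over all rounds. The `choose_matrix` policy below (fresh Gaussian rows that ignore the history) is a placeholder assumption, only there to make the skeleton runnable.

```python
import numpy as np

rng = np.random.default_rng(1)
n, R, rows_per_round = 1000, 3, 20

def choose_matrix(past_observations, rows, n):
    # Placeholder policy: a real adaptive scheme would use past_observations
    # to concentrate its rows; here we simply draw fresh Gaussian rows.
    return rng.standard_normal((rows, n))

x = np.zeros(n)
x[7] = 1.0                                 # the unknown signal (1-sparse here)

observations, m_total = [], 0
for _ in range(R):
    A_i = choose_matrix(observations, rows_per_round, n)
    y_i = A_i @ x                          # observe A_i x
    observations.append((A_i, y_i))
    m_total += A_i.shape[0]                # m = total rows across all A_i

print("rounds R =", R, ", total measurements m =", m_total)
```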

Prior Work

Nonadaptively: m ≈ (1/ε)·k log n for C = 1 + ε.

One line of work: ε = o(1) for m ≈ k log n.
◮ [Malioutov, Sanghavi, Willsky '08], [Castro, Haupt, Nowak, Raz '08], [Haupt, Castro, Nowak '11], [Haupt, Baraniuk, Castro, Nowak '12]

Another line: also allows m ≪ k log n.
◮ [Indyk-Price-Woodruff '11], [Nakos, Shi, Woodruff, Zhang '18]
◮ m ≲ (log log(1/ε)/ε)·k + k·log log n

Lower bounds:
◮ [Arias-Castro, Candès, Davenport '13]: m ≳ (1/ε)·k
◮ [Price, Woodruff '13]: m ≳ log log n.

Results in adaptive sparse recovery, C = O(1)

Unlimited adaptivity: with unlimited rounds, k + log log n ≲ m* ≲ k · log log n.
Limited adaptivity: with R = O(1) rounds, k + log^{1/R} n ≲ m* ≲ k · log^{1/(R−3)} n.
New results: with R = O(1) rounds, k · log^{1/R} n ≲ m* ≲ k · log^{1/R} n · log* k.
With caveat: the lower bound only applies for k < 2^{log^{1/R} n} ⟺ m* > k log k.
For k < n^{o(1)}, m* = ω(k).



Well-understood setting: k = 1

Theorem (Indyk-Price-Woodruff '11, Price-Woodruff '13)
R-round 1-sparse recovery requires Θ(R · log^{1/R} n) measurements.

Outline of this section:
◮ R = 1 lower bound: Ω(log n).
◮ Adaptive upper bound: O(log log n).
◮ Adaptive lower bound: Ω(log log n).

Hard case: x is a random e_z plus Gaussian noise w with ‖w‖₂ ≈ 1.
Robust recovery must locate z.
Observations: ⟨v, x⟩ = v_z + ⟨v, w⟩ = v_z + (‖v‖₂/√n)·z′, for z′ ∼ N(0, 1).

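A small numerical sketch of the hard case, under the added assumption that w has i.i.d. N(0, 1/n) entries (so ‖w‖₂ ≈ 1): then ⟨v, w⟩ has standard deviation ‖v‖₂/√n, matching the form ⟨v, x⟩ = v_z + (‖v‖₂/√n)·z′ above.

```python
import numpy as np

rng = np.random.default_rng(2)
n, trials = 10_000, 2_000

v = rng.standard_normal(n)                    # an arbitrary measurement vector
noise_parts = []
for _ in range(trials):
    z = rng.integers(n)                       # random signal location
    w = rng.standard_normal(n) / np.sqrt(n)   # noise with ||w||_2 close to 1
    x = np.zeros(n)
    x[z] = 1.0
    x += w
    noise_parts.append(v @ x - v[z])          # the <v, w> part of the observation

print("empirical std of <v, w>:", np.std(noise_parts))
print("predicted ||v||_2 / sqrt(n):", np.linalg.norm(v) / np.sqrt(n))
```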

1-sparse recovery: non-adaptive lower bound

Observe ⟨v, x⟩ = v_z + (‖v‖₂/√n)·z′, where z′ ∼ N(0, Θ(1)).

Shannon-Hartley theorem: the AWGN channel capacity is
  I(z; ⟨v, x⟩) ≤ (1/2)·log(1 + SNR),
where SNR denotes the “signal-to-noise ratio,”
  SNR = E[signal²] / E[noise²] = E[v_z²] / (‖v‖₂²/n) = 1.

Finding z needs Ω(log n) non-adaptive measurements.

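A quick check of the SNR computation above: with z uniform over [n], the average signal power E[v_z²] is exactly ‖v‖₂²/n, so the ratio is 1 and a non-adaptive measurement carries only (1/2)·log(1 + 1) = 1/2 bit about z, which is where the Ω(log n) count comes from. (Sketch only; the Gaussian v is an arbitrary example.)

```python
import numpy as np

rng = np.random.default_rng(3)
n = 5000
v = rng.standard_normal(n)

signal_power = np.mean(v ** 2)                   # E_z[v_z^2] for z uniform on [n]
noise_power = np.linalg.norm(v) ** 2 / n         # ||v||_2^2 / n
snr = signal_power / noise_power
bits_per_measurement = 0.5 * np.log2(1 + snr)

print("SNR =", snr)                              # exactly 1 up to floating point
print("bits per measurement =", bits_per_measurement)
print("measurements to learn log2(n) bits >=", np.log2(n) / bits_per_measurement)
```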

1-sparse recovery: changes in adaptive setting

Information capacity: I(z; ⟨v, x⟩) ≤ (1/2)·log(1 + SNR), where SNR denotes the “signal-to-noise ratio,” SNR ≲ E[v_z²] / (‖v‖₂²/n).

If z is independent of v, this is 1.
As we learn about z, we can increase the SNR.


1-sparse recovery: adaptive upper bound

x = e_z + w

[Figure: an animation of the candidate set for z shrinking as bits are learned (0 → 1 → 2 → 4 → 8 bits), with the measurement vector v restricted to the candidate set. The SNR of ⟨v, x⟩ = v_z + ⟨v, w⟩ grows from 2 to 2^16, and each measurement gives I(z; ⟨v, x⟩) ≈ log SNR bits (1, 2, 4, 8, 16), so the number of known bits doubles per measurement.]


1-sparse recovery: adaptive lower bound

Review of upper bound:
◮ Given b bits of information about z.
◮ Identifies z to a set of size n/2^b.
◮ Increases the SNR, E[v_z²], by 2^b.
◮ Recover b bits of information in one measurement.
◮ 1 → 2 → ⋯ → log n in log log n measurements (see the sketch after this slide).
◮ R = 2: 1 → √log n → log n, with √log n measurements per round.

Lower bound outline:
◮ At each stage, have a posterior distribution p on z.
◮ b = log n − H(p) bits known.

Lemma (Key lemma for k = 1)
For any measurement vector v, I(z; ⟨v, x⟩) ≲ b + 1.

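A small counting sketch of the schedules in the upper-bound review (pure bookkeeping, no recovery is simulated, all constants set to 1): with unlimited adaptivity the number of known bits doubles per measurement, reaching log n after about log log n measurements; with R rounds of m non-adaptive measurements each, a round multiplies the bits known by roughly m, so m ≈ (log n)^{1/R} per round suffices.

```python
import math

n = 2 ** 64
target = math.log2(n)                       # bits needed to name z in [n]

# Unlimited adaptivity: bits double with each measurement (1 -> 2 -> 4 -> ...).
bits, measurements = 1.0, 0
while bits < target:
    bits *= 2
    measurements += 1
print("doubling schedule:", measurements, "measurements; log2(log2 n) =",
      math.log2(target))

# R rounds of m measurements each: each round multiplies the bits by about m,
# so m = ceil(target ** (1/R)) per round reaches the target.
for R in (2, 3, 4):
    m = math.ceil(target ** (1 / R))
    bits = 1.0
    for _ in range(R):
        bits *= m
    print(f"R={R}: m={m} per round reaches {bits:.0f} >= {target:.0f} bits")
```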

1-sparse recovery: adaptive lower bound

Lower bound outline:
◮ At each stage, have a posterior distribution p on z.
◮ b = log n − H(p) bits known.
◮ Show any measurement gives O(b + 1) bits of information.

Shannon-Hartley:
  I(z; ⟨v, x⟩) ≤ (1/2)·log(1 + SNR) ≲ 1 + log( Σ_z v_z²·p_z / (‖v‖₂²/n) ) ≤ 1 + log(n·‖p‖_∞).

Bound is good (SNR ≈ 2^b) when the nonzero p_z are similar.
Can be terrible in general: b = 1 but SNR = n/log n.


1-sparse recovery: adaptive lower bound

Lower bound outline:
◮ At each stage, have a posterior distribution p on z.
◮ b = log n − H(p) = Σ_z p_z·log(n·p_z) bits known.
◮ Show any measurement gives O(b + 1) bits of information.

Partition the indices into “level sets” S₀, S₁, … ⊆ [n] of p:
◮ S_J = {z | p_z ∈ [2^J/n, 2^{J+1}/n]}
◮ E[J] ≲ b (see the sketch after this slide).

I(z; ⟨v, x⟩) ≤ I(z; ⟨v, x⟩ | J) + H(J).
Shannon-Hartley: I(z; ⟨v, x⟩ | J = j) ≲ j + 1.

Lemma (Key lemma for k = 1)
I(z; ⟨v, x⟩) ≲ b + 1.

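A numeric sketch of the level-set bookkeeping (my own illustration, not the paper's code): take a skewed posterior p, compute b = log n − H(p) = Σ_z p_z·log(n·p_z), assign each index the level J with p_z ≈ 2^J/n (levels below 0 clipped to 0), and check that E_{z∼p}[J] stays within an additive constant of b.

```python
import numpy as np

n = 4096

# A skewed posterior p: a few heavy indices plus a nearly uniform tail.
p = np.full(n, 1.0)
p[:8] = n / 16.0
p /= p.sum()

b = np.sum(p * np.log2(n * p))                  # b = log n - H(p), in bits

# Level sets S_J = {z : p_z in [2^J/n, 2^{J+1}/n)}, clipped at J = 0.
levels = np.maximum(np.floor(np.log2(n * p)), 0)
expected_J = np.sum(p * levels)                 # E[J] for z ~ p

print(f"b = {b:.3f} bits, E[J] = {expected_J:.3f}")
print("level-set sizes:",
      {int(j): int((levels == j).sum()) for j in np.unique(levels)})
```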

1-sparse recovery: adaptive lower bound: finishing up

Lemma (Key lemma for k = 1)
I(z; ⟨v, x⟩) ≲ b + 1.

Suppose two rounds with m measurements each.
◮ O(m) bits learned in the first round.
◮ O(m²) bits in the second round.
◮ Hence m ≳ √log n.

In general: Ω(R · log^{1/R} n) measurements.
◮ Ω(log log n) for unlimited R.


Recall the k = 1 proof outline

Setting: x = e_z + w for z ∼ p.
p is the posterior on z from previous measurements.
Previous measurements had information content b := log n − H(p).

Lemma (Key lemma for k = 1)
I(z; ⟨v, x⟩) ≲ b + 1.

Question: How do we extend this to k > 1?


Extending to general k

Create k independent copies over a domain of size N = nk.
Formally: x = Σ_{i=1}^{k} e_{n·i + Z_i} + w for Z ∈ [n]^k, Z ∼ p (see the sketch after this slide).
p is the posterior from previous measurements.
Previous measurements have information content b := k·log n − H(p).

Lemma (Key lemma for general k)
I(Z; ⟨v, x⟩) ≲ b + 1 ????

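A small sketch of the k-block construction above (assuming 0-indexed blocks, so block i's spike sits at coordinate i·n + Z_i; the exact indexing convention does not matter):

```python
import numpy as np

rng = np.random.default_rng(5)
n, k = 256, 8
N = n * k                                   # domain size of the stacked vector

Z = rng.integers(n, size=k)                 # Z in [n]^k, one location per block
w = rng.standard_normal(N) / np.sqrt(N)     # noise with ||w||_2 close to 1

x = np.zeros(N)
for i in range(k):
    x[i * n + Z[i]] += 1.0                  # e_{n*i + Z_i} for block i
x += w

# Each length-n block contains exactly one unit spike (plus small noise).
blocks = x.reshape(k, n)
print("per-block argmax:", blocks.argmax(axis=1))
print("true Z:          ", Z)
```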

What lemma do we want for general k?

I(Z; ⟨v, x⟩) ≲ b + 1
◮ True but too weak: would get Ω(√(k log n)), not k·√(log n).

I(Z; ⟨v, x⟩) ≲ b/k + 1
◮ Strong but false: if the algorithm does 1-sparse recovery on the first block, it really can learn Θ(b + 1) bits.
◮ But the learned bits are only about that first block.

I(Z_W; ⟨v, x⟩) ≲ b/k + 1 for |W| > 0.99k
◮ Strong enough, at least for constant R.
◮ True for product distributions p…
◮ but correlated p can make this false.

I(Z_W; ⟨v, x⟩) ≲ b/k + log k
◮ True!
◮ Strong enough if b > k log k after the first round.

Approach

I(Z_W; ⟨v, x⟩) ≲ b/k + log k.

Data processing and Shannon-Hartley:
  I(Z_W; ⟨v, x⟩) ≤ I( Σ_{i∈W} v_{Z_i} ; (Σ_{i∈W} v_{Z_i}) + ⟨v, w⟩ ) ≤ (1/2)·log(1 + SNR),
where
  SNR := E_{Z∼p}[(Σ_{i∈W} v_{Z_i})²] / (‖v‖₂²/n) ≤ k·E_{Z∼p}[Σ_{i∈W} v_{Z_i}²] / (‖v‖₂²/n) ≤ k·max_{i∈W} SNR^{(i)}.

So we just need max_{i∈W} log(1 + SNR^{(i)}) ≲ b/k.


Approach

[Figure: the k blocks, each with a (random) level set index J.]

Would like to find a set W such that max_{i∈W} log(1 + SNR^{(i)}) ≲ b/k.

What's actually true: E_i E_J[ log(1 + (SNR^{(i)} | J)) ] ≲ b/k.

Find W = W(J) so that max_{i∈W} E_J[ log(1 + (SNR^{(i)} | J)) ] ≲ b/k and |W| ≥ 0.99k with 99% probability.


Goal for general k

Lemma (Key lemma for general k)
One can choose a set W = W(J) ⊂ [k] of expected size 0.99k so that I(Z_W; Ax) ≲ m·(b/k + log k) + (b + k) for any A ∈ R^{m×N}.

Recall the k = 1 approach:
  I(Z; Ax) = I(Z; Ax | J) + H(J) ≤ m · E_J[(1/2)·log(1 + (SNR | J))] + O(b + 1) ≲ m·(b + 1) + (b + 1).


Goal for general k

I(Z_W; Ax) ≲ m·(b/k + log k) + (b + k)

k = 1:
  I(Z; Ax) = I(Z; Ax | J) + H(J)
           ≤ m · E_J[log(SNR | J)] + b + 1
           ≲ m·(b + 1) + (b + 1)

General k:
  I(Z_W; Ax) = I(Z_W; Ax | J) + H(J)
             ≤ m · E_J[log(SNR(Σ_{i∈W} Z_i) | J)] + b + k
             ≤ m · E_J[log(k · max_{i∈W} SNR(Z_i) | J)] + b + k
             ≲ m·(b/k + log k) + (b + k)


Wrapping up the lower bound for R = 2

Suppose m > k log k measurements per round.
First round is nonadaptive: learn b = O(m) bits.
Second round: learn m·(b/k + log k) + (b + k) = O(m²/k) bits.
But we need to learn |W|·log n ≈ k·log n bits.
Hence m ≳ k·√(log n), if this is more than k log k (see the sketch after this slide).

Open questions:
◮ Less restriction on k? Conjecture: I(Z_W; Ax | Z_{W̄}) ≲ b/k + 1.
◮ Better dependence on R?
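A back-of-the-envelope check of the counting above, with every constant set to 1 (illustration only): taking m = k·√(log n) per round, the first round gives b ≈ m bits, the second round gives about m·(b/k + log k) ≈ m²/k bits, and this is compared against the roughly k·log n bits needed; the last column shows k·log k, the regime where the lemma's caveat kicks in.

```python
import math

n = 2 ** 64
log_n = math.log2(n)

for k in (16, 256, 4096):
    m = k * math.sqrt(log_n)                # candidate threshold m ~ k sqrt(log n)
    b1 = m                                  # round 1 (nonadaptive): O(m) bits
    b2 = m * (b1 / k + math.log2(k))        # round 2: m (b/k + log k) bits
    needed = k * log_n                      # must learn about k log n bits in total
    print(f"k={k}: m={m:.0f}, round-2 bits~{b2:.0f}, "
          f"needed~{needed:.0f}, k log k={k * math.log2(k):.0f}")
```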


Standard sparse recovery approach

We have an optimal adaptive 1-sparse recovery algorithm.

Standard technique:
1. Throw coordinates into buckets.
2. 1-sparse recovery within each bucket.
3. Clean up mistakes.

Problem: the surrounding steps add rounds.
◮ [IPW '11]: cleanup is recursive, multiplying the number of rounds by O(log* k).
◮ [NSZW '18]: 1 round of setup, 2 rounds of cleanup.

Our approach: avoid the reduction to k = 1.
◮ Instead, reduce to C-approximate k-sparse recovery for C ≫ 1.
◮ This is solvable nonadaptively with O(k·log_C(n/k)·log* k) measurements [Price-Woodruff '12].


Basic approach: R = 2

1. Throw the coordinates into B = k·2^{√log n} buckets, and nonadaptively apply k-sparse O(1)-approximate recovery.
◮ k·log(B/k) = k·√log n measurements.
2. Apply k-sparse 2^{√log n}-approximate recovery to the preimage.
◮ k·log_C n = k·√log n measurements (see the sketch after this slide).

Key problem: we can't miss anything important in the first round.
◮ There will be collisions.
◮ Yet if x has no noise, we must find every entry.

Solution: triple Gaussian hashing.

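The per-round measurement counts in the two steps above are just arithmetic with base-2 logarithms and no constants; a sketch:

```python
import math

n = 2 ** 64
k = 16
sqrt_log_n = math.sqrt(math.log2(n))

B = k * 2 ** sqrt_log_n                     # number of buckets in round 1
round1 = k * math.log2(B / k)               # k log(B/k) measurements
C = 2 ** sqrt_log_n                         # approximation factor in round 2
round2 = k * math.log2(n) / math.log2(C)    # k log_C n measurements

print(f"sqrt(log n) = {sqrt_log_n:.1f}")
print(f"round 1: {round1:.0f} measurements, round 2: {round2:.0f} measurements")
print(f"k * sqrt(log n) = {k * sqrt_log_n:.0f}")
```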

Hashing

For intuition, consider x being k-sparse binary plus Gaussian noise with norm 1.
◮ Successful recovery must find all but O(1) of the binary entries of x.

Given a partition h : [n] → [B], how do we condense x ∈ R^n into y ∈ R^B?
◮ Goal: the preimage of k-sparse recovery on y includes the large entries of x.

Random signs: s : [n] → {±1} and y_u = Σ_{i:h(i)=u} x_i·s(i). [Figures: behavior without noise and with noise.]


Gaussian hashing

Random signs: s : [n] → {±1} and y_u = Σ_{i:h(i)=u} x_i·s(i). [Figures: without noise, with noise.]

Gaussian hashing: g ∼ N(0, I_n) and y_u = Σ_{i:h(i)=u} x_i·g(i). [Figures: without noise, with noise.]

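A minimal sketch of the two condensation schemes (my own illustration, and my reading of the without-noise/with-noise figures): hash the n coordinates into B buckets, then combine each bucket's entries either with random ±1 signs or with Gaussian coefficients. With signs, two colliding equal entries cancel with probability 1/2 and are lost even without noise; with Gaussian coefficients the combined value is nonzero almost surely.

```python
import numpy as np

rng = np.random.default_rng(6)
n, B, k = 1024, 64, 16

x = np.zeros(n)
x[rng.choice(n, size=k, replace=False)] = 1.0    # k-sparse binary signal, no noise

h = rng.integers(B, size=n)                      # bucket assignment h : [n] -> [B]
s = rng.choice([-1.0, 1.0], size=n)              # random signs
g = rng.standard_normal(n)                       # Gaussian coefficients

y_sign = np.zeros(B)
y_gauss = np.zeros(B)
np.add.at(y_sign, h, x * s)                      # y_u = sum_{h(i)=u} x_i s(i)
np.add.at(y_gauss, h, x * g)                     # y_u = sum_{h(i)=u} x_i g(i)

occupied = np.unique(h[x != 0])                  # buckets that received signal
print("buckets containing signal:", len(occupied))
print("zeroed out by sign hashing:    ", int(np.isclose(y_sign[occupied], 0).sum()))
print("zeroed out by Gaussian hashing:", int(np.isclose(y_gauss[occupied], 0).sum()))
```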

Triple Gaussian hashing

Triple Gaussian hashing: g₁, g₂, g₃ ∼ N(0, I_n); y_u^{(j)} = Σ_{i:h(i)=u} x_i·g_j(i). [Figures: Try 1, Try 2, Try 3.]
◮ Take the union of the three independent sparse recovery attempts (see the sketch after this slide).
◮ Expected false negatives are O(noise), so they can be skipped.

Avoids the cleanup rounds, getting O(k·log^{1/R} n · log* k) measurements.

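A sketch of the triple-hashing idea, simplified in one way: instead of running a full k-sparse recovery on each condensed vector y^{(j)}, it just keeps the k largest buckets per try and takes the union of their preimages as the candidate set for the next round.

```python
import numpy as np

rng = np.random.default_rng(7)
n, B, k = 4096, 256, 16

x = np.zeros(n)
support = rng.choice(n, size=k, replace=False)
x[support] = 1.0                                 # noiseless k-sparse signal

h = rng.integers(B, size=n)                      # one shared partition h : [n] -> [B]
candidates = set()
for j in range(3):                               # three independent Gaussian tries
    g = rng.standard_normal(n)
    y = np.zeros(B)
    np.add.at(y, h, x * g)                       # y_u^{(j)} = sum_{h(i)=u} x_i g_j(i)
    top_buckets = np.argsort(-np.abs(y))[:k]     # stand-in for k-sparse recovery on y
    for u in top_buckets:                        # preimage of the recovered buckets
        candidates.update(np.flatnonzero(h == u).tolist())

found = len(set(support.tolist()) & candidates)
print(f"candidate set size: {len(candidates)} out of n = {n}")
print(f"signal coordinates covered: {found} of k = {k}")
```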

Results

Previously: k + log^{1/R} n ≲ m ≲ k·log^{1/(R−3)} n.
Now: k·log^{1/R} n ≲ m ≲ k·log^{1/R} n·log* k, where the lower bound applies if this is above k·log k.

Biggest question:

◮ Are ω(k) measurements necessary for unlimited R?

Thank You

