SLIDE 1

Gaussian Approximation of Quantization Error for Inference from Compressed Data

Alon Kipnis (Stanford) and Galen Reeves (Duke), ISIT, July 2019

SLIDE 2

Table of Contents

◮ Introduction: Motivation; Contribution
◮ Main results: Non-asymptotic; Asymptotic
◮ Examples / Applications: Standard Source Coding; Quantized Compressed Sensing

SLIDE 3

Motivation

Inference from Compressed Data

θ → X ∼ P_{X|θ} (data) → inference → θ̂

SLIDE 4

Motivation

Inference from Compressed Data

θ → X ∼ P_{X|θ} (data) → compression (bit limitation) → Y → inference → θ̂

SLIDE 5

Motivation

Inference from Compressed Data

θ → X ∼ P_{X|θ} (data) → compression (bit limitation) → Y → inference → θ̂

◮ Indirect rate-distortion [Dobrushin & Tsybakov ’62], [Berger ’71]
◮ Quantized compressed sensing [Kamilov et al. ’12], [Xu et al. ’14], [Kipnis et al. ’17, ’18]
◮ Estimation under communication constraints [Han ’87], [Zhang & Berger ’88], [Duchi et al. ’17], [Duchi & Kipnis ’17], [Barnes et al. ’18], [Han et al. ’18]
◮ Task-oriented quantization [Kassam ’77], [Picinbono & Duvaut ’88], [Gersho ’96], [Misra et al. ’08], [Shlezinger et al. ’18]

SLIDE 6

Motivation

Inference from Compressed Data

θ → X ∼ P_{X|θ} (data) → compression (bit limitation) → Y → inference → θ̂

◮ Indirect rate-distortion [Dobrushin & Tsybakov ’62], [Berger ’71]
◮ Quantized compressed sensing [Kamilov et al. ’12], [Xu et al. ’14], [Kipnis et al. ’17, ’18]
◮ Estimation under communication constraints [Han ’87], [Zhang & Berger ’88], [Duchi et al. ’17], [Duchi & Kipnis ’17], [Barnes et al. ’18], [Han et al. ’18]
◮ Task-oriented quantization [Kassam ’77], [Picinbono & Duvaut ’88], [Gersho ’96], [Misra et al. ’08], [Shlezinger et al. ’18]

Challenge: combining estimation theory and quantization

SLIDE 7

Lossy Compression vs AWGN Channel

θ → X ∼ P_{X|θ}
Quantization path: X → Enc → {1, . . . , 2^{nR}} → Dec → Y
AWGN path: Z = X + W, W ∼ N(0, (1/snr) I)

SLIDE 8

Lossy Compression vs AWGN Channel

θ → X ∼ P_{X|θ}
Quantization path: X → Enc → {1, . . . , 2^{nR}} → Dec → Y
AWGN path: Z = X + W, W ∼ N(0, (1/snr) I)

◮ Plenty of pitfalls and non-rigorous work [Gray]
◮ Some rigorous “high bit resolution” results [Lee & Neuhoff ’96], [Viswanathan & Zamir ’01], [Marco & Neuhoff ’05]

SLIDE 9

Lossy Compression vs AWGN Channel

θ → X ∼ P_{X|θ}
Quantization path: X → Enc → {1, . . . , 2^{nR}} → Dec → Y
AWGN path: Z = X + W, W ∼ N(0, (1/snr) I)

◮ Plenty of pitfalls and non-rigorous work [Gray]
◮ Some rigorous “high bit resolution” results [Lee & Neuhoff ’96], [Viswanathan & Zamir ’01], [Marco & Neuhoff ’05]

This talk: if X is encoded using a random spherical code, then Wass₂(Y, Z | X) ≈ const, where snr = 2^{2R} − 1.

SLIDE 10

Geometric Interpretation of Gaussian Source Coding

[Sakrison ’68], [Wyner ’68]

SLIDE 11

Geometric Interpretation of Gaussian Source Coding

[Sakrison ’68], [Wyner ’68]

X lies on the input sphere of radius √n.

SLIDE 12

Geometric Interpretation of Gaussian Source Coding

[Sakrison ’68], [Wyner ’68]

X lies on the input sphere of radius √n; its representation Ȳ lies on a representation sphere of radius r, at angle α to X, with sin α → 2^{−R}.
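The step from the representation angle back to the classical Gaussian distortion-rate function is worth one line; a sketch of the arithmetic (my reconstruction, assuming X concentrates on the radius-√n sphere):

```latex
% X sits near the sphere of radius \sqrt{n}; its representation \bar{Y}
% makes angle \alpha with X, so the error length is about \sqrt{n}\sin\alpha.
\|X - \bar{Y}\| \approx \sqrt{n}\,\sin\alpha
\quad\Longrightarrow\quad
\frac{1}{n}\,\mathbb{E}\|X - \bar{Y}\|^{2} \;\to\; 2^{-2R} = D_{\mathrm{Gauss}}(R)
\qquad \text{as } \sin\alpha \to 2^{-R}.
```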

SLIDE 13

Geometric Interpretation of Gaussian Source Coding

[Sakrison ’68], [Wyner ’68]

X lies on the input sphere of radius √n; its representation Ȳ lies on a representation sphere of radius r, at angle α to X, with sin α → 2^{−R}. The quantization error X − Ȳ lies on an error sphere.

SLIDE 14

Geometric Interpretation of Gaussian Source Coding

[Sakrison ’68], [Wyner ’68]

X lies on the input sphere of radius √n; its representation Ȳ lies on a representation sphere of radius r, at angle α to X, with sin α → 2^{−R}. The quantization error X − Ȳ lies on an error sphere.

This talk: X on the sphere of radius √n and Y on a representation sphere of radius ρ√n, at angle α to X.

SLIDE 15

Overview of Contributions

Z = X + W, W ∼ N(0, (1/snr) I), with snr = 2^{2R} − 1; X lies on the sphere of radius √n and Y is its rate-R spherical-code representation.

SLIDE 16

Overview of Contributions

Z = X + W, W ∼ N(0, (1/snr) I), with snr = 2^{2R} − 1; X lies on the sphere of radius √n and Y is its rate-R spherical-code representation.

◮ Strong equivalence between quantization error using rate-R random spherical coding and AWGN with SNR 2^{2R} − 1

SLIDE 17

Overview of Contributions

Z = X + W, W ∼ N(0, (1/snr) I), with snr = 2^{2R} − 1; X lies on the sphere of radius √n and Y is its rate-R spherical-code representation.

◮ Strong equivalence between quantization error using rate-R random spherical coding and AWGN with SNR 2^{2R} − 1
◮ Applications to inference from compressed data

SLIDE 18

Table of Contents

◮ Introduction: Motivation; Contribution
◮ Main results: Non-asymptotic; Asymptotic
◮ Examples / Applications: Standard Source Coding; Quantized Compressed Sensing

SLIDE 19

Main Result (Non-Asymptotic)

Gaussian Approximation of Quantization Error

θ → X ∼ P_{X|θ} → rate-R spherical code → Y, compared with Z = X + W, W ∼ N(0, σ²I)

Theorem

For P_X with finite second moments, let

ρ = E‖X‖ / √(n(1 − 2^{−2R})),  σ = E‖X‖ / √(n(2^{2R} − 1)) = ρ · 2^{−R}.

Then

Wass₂²(Y, Z | X) ≤ var(‖X‖) + 2σ² + C_R · E‖X‖² · log²(n) / n².
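The theorem's message can be checked numerically at small block length. The sketch below is my own illustration, not the paper's construction: a unit-power Gaussian source with n = 12 and R = 1, encoded by maximum correlation onto 2^{nR} random codewords on a sphere of radius ρ√n; the per-coordinate quantization-error variance should come out close to the AWGN variance 1/snr = 1/(2^{2R} − 1).

```python
import numpy as np

rng = np.random.default_rng(0)
n, R = 12, 1                     # small n so the 2^{nR}-word codebook is enumerable
M = 2 ** (n * R)                 # codebook size
snr = 2 ** (2 * R) - 1           # AWGN SNR predicted by the theorem
rho = 1.0 / np.sqrt(1 - 2.0 ** (-2 * R))   # representation-sphere scale (unit-power source)

# Random spherical code: M iid points uniform on the sphere of radius rho*sqrt(n).
C = rng.standard_normal((M, n))
C *= rho * np.sqrt(n) / np.linalg.norm(C, axis=1, keepdims=True)

# Encode 200 source vectors by maximum correlation and collect the errors Y - X.
errs = []
for _ in range(200):
    x = rng.standard_normal(n)
    y = C[np.argmax(C @ x)]
    errs.append(y - x)
errs = np.concatenate(errs)

emp_var = errs.var()
print(f"empirical error variance {emp_var:.3f} vs AWGN variance {1 / snr:.3f}")
```

At this tiny block length the match is only approximate (finite-n effects shrink the angle slightly), which is exactly the gap the non-asymptotic bound quantifies.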

SLIDE 20

Wasserstein Distance and Lipschitz Continuity

Definition (quadratic Wasserstein distance)

Wass₂(Y, Z) ≜ inf over couplings P_{Y,Z} of √(E‖Y − Z‖²), with the marginals P_Y, P_Z fixed.

(a.k.a. Kantorovich, Kantorovich-Rubinstein, “transportation”, ρ̄, “earth mover’s”, Gini, Fréchet, Vallender, Mallows, ...)
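In one dimension the infimum over couplings is attained by the monotone (quantile) coupling, so the empirical Wass₂ distance is just a sorted-sample match. A minimal sketch (my own illustration), checked on two Gaussians whose true distance is the mean shift:

```python
import numpy as np

rng = np.random.default_rng(1)

def wass2_1d(y, z):
    # In 1-D the optimal coupling matches sorted samples (quantile coupling),
    # so the empirical quadratic Wasserstein distance is a sorted-sample RMS.
    return np.sqrt(np.mean((np.sort(y) - np.sort(z)) ** 2))

y = rng.standard_normal(50_000)          # samples from N(0, 1)
z = 0.5 + rng.standard_normal(50_000)    # samples from N(0.5, 1)
print(wass2_1d(y, z))                    # close to 0.5, the mean shift
```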

SLIDE 21

Wasserstein Distance and Lipschitz Continuity

Definition (quadratic Wasserstein distance)

Wass₂(Y, Z) ≜ inf over couplings P_{Y,Z} of √(E‖Y − Z‖²), with the marginals P_Y, P_Z fixed.

(a.k.a. Kantorovich, Kantorovich-Rubinstein, “transportation”, ρ̄, “earth mover’s”, Gini, Fréchet, Vallender, Mallows, ...)

Fact

For any L-Lipschitz f:

| √(E‖θ − f(Y)‖²) − √(E‖θ − f(Z)‖²) | ≤ L · Wass₂(Y, Z | X)
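A quick sanity check of the unconditional analogue of this Fact (all choices here are my own illustration: θ = 0, f = tanh which is 1-Lipschitz, Y ∼ N(0, 1), Z ∼ N(0.3, 1), so Wass₂(Y, Z) = 0.3):

```python
import numpy as np

rng = np.random.default_rng(2)
m = 200_000
theta = 0.0
y = rng.standard_normal(m)          # Y ~ N(0, 1)
z = 0.3 + rng.standard_normal(m)    # Z ~ N(0.3, 1), so Wass2(Y, Z) = 0.3

f = np.tanh                          # a 1-Lipschitz estimator
rmse_y = np.sqrt(np.mean((theta - f(y)) ** 2))
rmse_z = np.sqrt(np.mean((theta - f(z)) ** 2))
gap = abs(rmse_y - rmse_z)
print(gap)                           # must be at most L * Wass2 = 0.3
```

The observed gap is far below the bound, as expected: the Fact only promises that Lipschitz estimators cannot amplify the Wasserstein discrepancy.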

SLIDE 22

Main Result (Asymptotic)

Asymptotic Squared Error

θ ∈ R^{d_n} → X ∼ P_{X|θ} → rate-R spherical code → Y, compared with Z = X + W, W ∼ N(0, σ²I)

Corollary

If (1/d_n) E‖θ − θ̂_n(Z)‖² = M(snr) + o(1), then

(1/d_n) E‖θ − θ̂_n(Y)‖² = M(2^{2R} − 1) + o(1),

provided:

◮ var(‖X‖) = O(1)
◮ Lip(θ̂_n) = o(√d_n)

SLIDE 23

Table of Contents

◮ Introduction: Motivation; Contribution
◮ Main results: Non-asymptotic; Asymptotic
◮ Examples / Applications: Standard Source Coding; Quantized Compressed Sensing

SLIDE 24

Examples / Applications

θ → X ∼ P_{X|θ} → Enc → {1, . . . , 2^{nR}} → Dec → θ̂

◮ Standard source coding: X = θ
◮ Quantized compressed sensing: X = Aθ + W

Not in this talk...

◮ Parametric estimation under bit constraints
◮ Optimization with gradient compression
◮ Data compression in latent space using a generative model

SLIDE 25

Example I: Standard Source Coding

X = θ, with θ_i iid ∼ P_θ and E[θ_1²] = 1

θ → Enc → Dec → θ̂ at R bits/symbol

SLIDE 26

Example I: Standard Source Coding

X = θ, with θ_i iid ∼ P_θ and E[θ_1²] = 1

θ → Enc → Dec → θ̂ at R bits/symbol; surrogate channel: Z = θ + W/√snr

SLIDE 27

Example I: Standard Source Coding

X = θ, with θ_i iid ∼ P_θ and E[θ_1²] = 1

θ → Enc → Dec → θ̂ at R bits/symbol; surrogate channel: Z = θ + W/√snr

1) Estimator: θ̂(z) = E[θ_1 | Z_1 = z]
2) MSE function: M(snr) = mmse(θ_1 | Z_1)

SLIDE 28

Example I: Standard Source Coding

X = θ, with θ_i iid ∼ P_θ and E[θ_1²] = 1

θ → Enc → Dec → θ̂ at R bits/symbol; surrogate channel: Z = θ + W/√snr

1) Estimator: θ̂(z) = E[θ_1 | Z_1 = z]
2) MSE function: M(snr) = mmse(θ_1 | Z_1)

Corollary

D_sp(R) = mmse(θ_1 | θ_1 + W/√(2^{2R} − 1)) is achievable with random spherical coding.

SLIDE 29

Example I: Standard Source Coding

X = θ, with θ_i iid ∼ P_θ and E[θ_1²] = 1

θ → Enc → Dec → θ̂ at R bits/symbol; surrogate channel: Z = θ + W/√snr

1) Estimator: θ̂(z) = E[θ_1 | Z_1 = z]
2) MSE function: M(snr) = mmse(θ_1 | Z_1)

Corollary

D_sp(R) = mmse(θ_1 | θ_1 + W/√(2^{2R} − 1)) is achievable with random spherical coding.

Note: D_sp(R) ≤ D_Gauss(R) = 2^{−2R}

◮ Compare to [Sakrison ’68], [Lapidoth ’97]
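For the equiprobable binary source of the next slide, D_sp(R) can be evaluated by Monte Carlo. A sketch (my own, with the illustrative choice R = 1), using the standard fact that for a ±1 prior under Gaussian noise the conditional mean is E[θ | z] = tanh(z/σ²):

```python
import numpy as np

rng = np.random.default_rng(3)
R = 1
snr = 2 ** (2 * R) - 1
sigma = 1.0 / np.sqrt(snr)

theta = rng.choice([-1.0, 1.0], size=1_000_000)   # equiprobable binary source
z = theta + sigma * rng.standard_normal(theta.size)

# Conditional mean E[theta | Z = z] = tanh(z / sigma^2) for a +/-1 prior.
est = np.tanh(z / sigma ** 2)
D_sp = np.mean((theta - est) ** 2)    # mmse(theta_1 | theta_1 + W/sqrt(snr))
D_gauss = 2.0 ** (-2 * R)
print(D_sp, D_gauss)                  # D_sp(R) falls below D_Gauss(R)
```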

SLIDE 30

Standard Source Coding (cont’d)

Illustration: Equiprobable Binary

P_θ = Unif({−1, 1}), i = 1, . . . , n

[Figure: MSE vs. rate R, comparing D_sp(R), D_Gauss(R), and D_Shannon(R)]

SLIDE 31

Example II: Quantized Compressed Sensing

X = Aθ + εW → {1, . . . , 2^{nR}} → θ̂, with A ∈ R^{n×d_n} and n/d_n → δ > 0

SLIDE 32

Example II: Quantized Compressed Sensing

X = Aθ + εW → {1, . . . , 2^{nR}} → θ̂, with A ∈ R^{n×d_n} and n/d_n → δ > 0

Surrogate channel: Z = Aθ + √(ε² + σ²) W′, where σ² = (1/snr) · E‖X‖²/n

SLIDE 33

Example II: Quantized Compressed Sensing

X = Aθ + εW → {1, . . . , 2^{nR}} → θ̂, with A ∈ R^{n×d_n} and n/d_n → δ > 0

Surrogate channel: Z = Aθ + √(ε² + σ²) W′, where σ² = (1/snr) · E‖X‖²/n

1) θ^T_AMP(Z) = T iterations of Approximate Message Passing [Donoho et al. ’09]
2) M^T_AMP(snr) = T iterations of the state evolution recursion [Bayati & Montanari ’11]

SLIDE 34

Example II: Quantized Compressed Sensing

X = Aθ + εW → {1, . . . , 2^{nR}} → θ̂, with A ∈ R^{n×d_n} and n/d_n → δ > 0

Surrogate channel: Z = Aθ + √(ε² + σ²) W′, where σ² = (1/snr) · E‖X‖²/n

1) θ^T_AMP(Z) = T iterations of Approximate Message Passing [Donoho et al. ’09]
2) M^T_AMP(snr) = T iterations of the state evolution recursion [Bayati & Montanari ’11]

Corollary

(1/d_n) E‖θ − θ^T_AMP(Y)‖² → M^T_AMP(2^{2R} − 1)
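The AMP / state-evolution pair on a Gaussian surrogate channel can be sketched in a few lines. Everything below is an illustration, not the talk's setup: a Bernoulli-Gaussian prior, a soft-threshold denoiser, and arbitrary sizes and noise level; the point is only that T iterations of AMP on the channel are tracked by T iterations of the scalar state-evolution recursion.

```python
import numpy as np

rng = np.random.default_rng(4)
n, d, T = 1000, 2000, 15           # measurements, signal dimension, iterations
delta = n / d
sparsity, lam = 0.05, 2.0          # Bernoulli-Gaussian prior, threshold multiplier
noise_std = 0.2                    # surrogate-channel noise std, sqrt(eps^2 + sigma^2)

def soft(u, t):
    # Soft-threshold denoiser eta(u; t).
    return np.sign(u) * np.maximum(np.abs(u) - t, 0.0)

# Problem instance: Z = A theta + noise (the Gaussian surrogate channel).
theta = (rng.random(d) < sparsity) * rng.standard_normal(d)
A = rng.standard_normal((n, d)) / np.sqrt(n)
z_obs = A @ theta + noise_std * rng.standard_normal(n)

# AMP with soft-threshold denoiser and Onsager correction.
x, r = np.zeros(d), z_obs.copy()
for _ in range(T):
    tau = np.sqrt(np.mean(r ** 2))             # empirical effective noise level
    pseudo = x + A.T @ r
    x = soft(pseudo, lam * tau)
    onsager = np.mean(np.abs(pseudo) > lam * tau) / delta
    r = z_obs - A @ x + onsager * r
mse_amp = np.mean((x - theta) ** 2)

# State evolution: the same recursion on scalar Monte Carlo averages.
tau2 = noise_std ** 2 + np.mean(theta ** 2) / delta
for _ in range(T):
    th = (rng.random(400_000) < sparsity) * rng.standard_normal(400_000)
    est = soft(th + np.sqrt(tau2) * rng.standard_normal(400_000), lam * np.sqrt(tau2))
    mse_se = np.mean((est - th) ** 2)
    tau2 = noise_std ** 2 + mse_se / delta

print(mse_amp, mse_se)   # empirical AMP MSE vs state-evolution prediction
```

The corollary then says that running the same AMP iterations on the spherical-code output Y gives asymptotically the state-evolution MSE evaluated at snr = 2^{2R} − 1.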

SLIDE 35

Summary

◮ Random spherical coding: strong equivalence between rate-R quantization error and AWGN with SNR 2^{2R} − 1
◮ Leads to a user-friendly tool for evaluating the performance of estimators from compressed data

Z = X + W, W ∼ N(0, (1/snr) I), with snr = 2^{2R} − 1; X lies on the sphere of radius √n and Y is its rate-R spherical-code representation.