

SLIDE 1

Convolutional dictionary learning based auto-encoders for natural exponential-family distributions

Bahareh Tolooshams*¹, Andrew H. Song*², Simona Temereanca³, and Demba Ba¹

¹Harvard University, ²Massachusetts Institute of Technology, ³Brown University. *Equal contributions.

CRISP Group: https://crisp.seas.harvard.edu

ICML 2020


SLIDE 2

1. Motivation
2. Introduction
3. Deep Convolutional Exponential Auto-encoder (DCEA)
4. Experiments
5. Conclusion


SLIDE 3

Motivation

Deep Learning

  • Fast and scalable ✓
  • Not interpretable ✗
  • Memory and computationally expensive ✗

Signal Processing (SP)

Generative models, e.g., the sparse coding model $y = Hx + \epsilon$, where $x$ is sparse.

  • Slow and not scalable ✗
  • Interpretable ✓
  • Memory efficient ✓

Goals:

  • Benefit from the scalability of deep learning for traditional SP tasks.
  • Use generative models as a guide to design interpretable and memory-efficient networks.


SLIDE 4

1. Motivation
2. Introduction
3. Deep Convolutional Exponential Auto-encoder (DCEA)
4. Experiments
5. Conclusion


SLIDE 5

Convolutional Dictionary Learning (CDL)

Generative model for each data point $j$:

$$y_j = \sum_{c=1}^{C} h_c * x_j^c + \epsilon_j = H x_j + \epsilon_j, \quad \epsilon_j \sim \mathcal{N}(0, \sigma^2 I),$$

where each $x_j^c$ is sparse.

Goal: Learn $H$ that maps the sparse representation $x_j$ to the data $y_j$:

$$\min_{\{h_c\}_{c=1}^{C},\, \{x_j\}_{j=1}^{J}} \sum_{j=1}^{J} \frac{1}{2} \|y_j - H x_j\|_2^2 + \lambda \|x_j\|_1$$

  • min w.r.t. $x_j$ → Convolutional Sparse Coding (CSC).
  • min w.r.t. $H$ and $x_j$ → Convolutional Dictionary Learning (CDL).
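The CSC step can be solved with an iterative proximal gradient method (ISTA [1], introduced on the next slide). Below is a minimal PyTorch sketch for a single 1-D signal; the function names, shapes, and hyperparameter defaults are illustrative assumptions, not the authors' implementation.

```python
import torch
import torch.nn.functional as F

def soft_threshold(z, lam):
    # proximal operator of lam * ||.||_1
    return z.sign() * torch.clamp(z.abs() - lam, min=0.0)

def csc_ista(y, h, lam=0.1, alpha=0.01, T=100):
    """Convolutional sparse coding via ISTA for one 1-D example.
    y: (1, 1, N) signal; h: (C, 1, K) filters; returns codes x of
    shape (1, C, N - K + 1)."""
    C, _, K = h.shape
    x = torch.zeros(1, C, y.shape[-1] - K + 1)
    for _ in range(T):
        r = y - F.conv_transpose1d(x, h)  # residual y - Hx, with Hx = sum_c h_c * x^c
        # gradient step on the quadratic term, then soft-thresholding:
        # x <- S_{alpha*lam}(x + alpha * H^T r)
        x = soft_threshold(x + alpha * F.conv1d(r, h), alpha * lam)
    return x
```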


SLIDE 6

Unfolding Networks

Solve CSC and CDL by unfolding an iterative proximal gradient algorithm into a network:

  • ISTA [1]: iterates $x_t = S\big(x_{t-1} + \alpha H^T (y - H x_{t-1})\big)$ with a fixed dictionary $H$.
  • LISTA [2]: unfolds ISTA for a fixed number of iterations and learns the weights ($W_e$, $S$).
  • CSCNet [3]: a convolutional unfolding with encoder filters $W_e$ and decoder filters $W_d$.

[Diagram: the ISTA, LISTA, and CSCNet architectures, each unrolling $y \to \tilde{y}_t \to x_t \to x_T$.]
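A minimal sketch of such an unfolded network in the spirit of CSCNet [3], with learned encoder/decoder filters and learned thresholds; the class name, shapes, and defaults are assumptions, not the reference implementation.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class UnrolledCSC(nn.Module):
    """CSCNet-style unrolling sketch (1-D): T unfolded ISTA iterations with
    learned encoder filters We, decoder filters Wd, and thresholds lam."""
    def __init__(self, num_filters=8, kernel_size=21, T=5):
        super().__init__()
        self.We = nn.Parameter(0.1 * torch.randn(num_filters, 1, kernel_size))
        self.Wd = nn.Parameter(0.1 * torch.randn(num_filters, 1, kernel_size))
        self.lam = nn.Parameter(0.01 * torch.ones(1, num_filters, 1))
        self.T = T

    def shrink(self, z):  # soft-thresholding with learned per-filter thresholds
        return z.sign() * F.relu(z.abs() - self.lam)

    def forward(self, y):  # y: (batch, 1, N)
        K = self.We.shape[-1]
        x = torch.zeros(y.shape[0], self.We.shape[0], y.shape[-1] - K + 1,
                        device=y.device)
        for _ in range(self.T):
            r = y - F.conv_transpose1d(x, self.Wd)     # residual
            x = self.shrink(x + F.conv1d(r, self.We))  # learned ISTA step
        return F.conv_transpose1d(x, self.Wd)          # reconstruction Wd x_T
```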


slide-7
SLIDE 7

Unfolding Networks

Solve CSC and CDL by iterative proximal gradient algorithm. y ˜ yt αHT xt xT H ISTA [1]:

  • y

We xt xT S LISTA [2]: y ˜ yt We xt xT H Wd CSCNet [3]:

  • Harvard CRISP

Convolutional dictionary learning based auto-encoders for natural exponential-family distributions 6 / 22

slide-8
SLIDE 8

Unfolding Networks

Solve CSC and CDL by iterative proximal gradient algorithm. y ˜ yt αHT xt xT H ISTA [1]:

  • y

We xt xT S LISTA [2]: y ˜ yt We xt xT H Wd CSCNet [3]:

  • Harvard CRISP

Convolutional dictionary learning based auto-encoders for natural exponential-family distributions 6 / 22

SLIDE 9

What if the observations are no longer Gaussian?

[Images: examples of count-valued data: a fingerprint and photon-based imaging.]

Classical CDL approach: alternating minimization with a Poisson generative model [4, 5].

  • Unsupervised ✓
  • Follows a generative model ⇒ interpretable ✓
  • Not scalable (denoising a single image can take minutes to hours) ✗


SLIDE 10

Our Contributions

[Diagram: DCEA auto-encoder; the encoder repeats the unfolded update $T$ times to produce $x_T$, and the decoder applies $H$ followed by the inverse link $f^{-1}(\cdot)$.]

  • An auto-encoder inspired by CDL, termed the Deep Convolutional Exponential Auto-encoder (DCEA), for non-real-valued data.
  • A demonstration of the flexibility of DCEA for both
      • unsupervised tasks, e.g., CDL, and
      • supervised tasks, e.g., the Poisson denoising problem.
  • An analysis of the gradient dynamics of the shallow exponential auto-encoder (SEA), proving that SEA recovers the parameters of the generative model.


SLIDE 11

1. Motivation
2. Introduction
3. Deep Convolutional Exponential Auto-encoder (DCEA)
4. Experiments
5. Conclusion


SLIDE 12

Deep Convolutional Exponential Auto-encoder

Problem description

Natural exponential family with a convolutional generative model:

$$\log p(y \mid \mu) = f(\mu)^T y + g(y) - B(\mu), \quad \text{where } f(\mu) = Hx \text{ and } x \text{ is sparse.}$$

| Distribution | $y$ | $B(z)$ | Inverse link $f^{-1}(\cdot)$ |
|---|---|---|---|
| Gaussian | $\mathbb{R}$ | $z^T z$ | $I(\cdot)$ (identity) |
| Binomial | $[0..M]$ | $-1^T \log(1 - z)$ | $\mathrm{sigmoid}(\cdot)$ |
| Poisson | $[0..\infty)$ | $1^T z$ | $\exp(\cdot)$ |

Exponential Convolutional Dictionary Learning (ECDL):

$$\min_{H, x} \underbrace{-\log p(y \mid \mu)}_{\text{negative log-likelihood}} + \underbrace{\lambda \|x\|_1}_{\text{code sparsity constraint}}$$
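For training, the negative log-likelihood can be written out per distribution. A minimal sketch follows; the helper name is an assumption, and the data-dependent constant $g(y)$ is dropped since it does not affect the optimization.

```python
import torch
import torch.nn.functional as F

def nll_loss(y, eta, family="poisson", M=1):
    """Negative log-likelihood of y given the natural parameter eta = Hx,
    up to constants in y. Illustrative helper, not the authors' code."""
    if family == "gaussian":    # -log N(y; eta, I), unit variance
        return 0.5 * (y - eta).pow(2).sum()
    if family == "binomial":    # y in [0..M], mean M * sigmoid(eta)
        return (M * F.softplus(eta) - y * eta).sum()
    if family == "poisson":     # mean exp(eta)
        return (torch.exp(eta) - y * eta).sum()
    raise ValueError(f"unknown family: {family}")
```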


SLIDE 13

Deep Convolutional Exponential Auto-encoder

Network architecture

[Diagram: the encoder repeats the unfolded update $T$ times; the decoder applies $H$ and the inverse link $f^{-1}(\cdot)$.]

Components for different distributions, with residual $\tilde{y}_t = y - f^{-1}(H x_{t-1})$:

| Distribution | $y$ | $f^{-1}(\cdot)$ | Encoder unfolding ($x_t$) | Decoder ($f(\hat{\mu})$) |
|---|---|---|---|---|
| Gaussian | $\mathbb{R}$ | $I(\cdot)$ | $S_b\big(x_{t-1} + \alpha H^T \tilde{y}_t\big)$ | $H x_T$ |
| Binomial | $[0..M]$ | $\mathrm{sigmoid}(\cdot)$ | $S_b\big(x_{t-1} + \alpha H^T (\tfrac{1}{M}\tilde{y}_t)\big)$ | $H x_T$ |
| Poisson | $[0..\infty)$ | $\exp(\cdot)$ | $S_b\big(x_{t-1} + \alpha H^T \mathrm{ELU}(\tilde{y}_t)\big)$ | $H x_T$ |
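A minimal PyTorch sketch of this constrained (tied-weight) encoder/decoder for 1-D signals. The class name, filter shapes, step size, and the one-sided shrinkage (ReLU, assuming nonnegative codes; a two-sided soft threshold also fits $S_b$) are illustrative assumptions.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class DCEA1d(nn.Module):
    """Sketch of a constrained (tied-weight) DCEA for 1-D signals: the encoder
    unfolds T proximal-gradient steps; the decoder reuses the same filters H."""
    def __init__(self, num_filters=8, kernel_size=21, T=10, alpha=0.1,
                 family="poisson", M=1):
        super().__init__()
        self.H = nn.Parameter(0.1 * torch.randn(num_filters, 1, kernel_size))
        self.b = nn.Parameter(0.01 * torch.ones(1, num_filters, 1))  # learned threshold
        self.T, self.alpha, self.family, self.M = T, alpha, family, M

    def f_inv(self, eta):  # inverse link per distribution (see the table above)
        if self.family == "gaussian":
            return eta
        if self.family == "binomial":
            return torch.sigmoid(eta)
        return torch.exp(eta)  # poisson

    def forward(self, y):  # y: (batch, 1, N)
        K = self.H.shape[-1]
        x = torch.zeros(y.shape[0], self.H.shape[0], y.shape[-1] - K + 1,
                        device=y.device)
        for _ in range(self.T):
            y_tilde = y - self.f_inv(F.conv_transpose1d(x, self.H))  # residual
            if self.family == "binomial":
                y_tilde = y_tilde / self.M
            elif self.family == "poisson":
                y_tilde = F.elu(y_tilde)  # ELU on the residual, per the table
            # S_b: shrinkage with learned bias b (one-sided here)
            x = F.relu(x + self.alpha * F.conv1d(y_tilde, self.H) - self.b)
        return F.conv_transpose1d(x, self.H)  # natural parameter Hx_T
```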


SLIDE 14

Deep Convolutional Exponential Auto-encoder

Training & inference

[Diagram: forward pass through $T$ unfolded encoder iterations and the decoder to the loss $L$; backward pass via back-propagation.]

Training

  • Forward pass: estimate the code $x_T$ and compute the loss function.
  • Backward pass (back-propagation): estimate the dictionary $H$.
  • Equivalent to alternating minimization in CDL (see the training-loop sketch below).

Inference: once trained, inference (a forward pass) is fast.
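A minimal training-loop sketch illustrating this, reusing the hypothetical `DCEA1d` module and `nll_loss` helper from the earlier sketches; `data_loader` is an assumed iterable of batches.

```python
import torch

# Assumed setup: DCEA1d and nll_loss are the sketches given earlier;
# data_loader is a hypothetical iterable of (batch, 1, N) count-valued signals.
model = DCEA1d(family="poisson")
opt = torch.optim.Adam(model.parameters(), lr=1e-3)

for y in data_loader:
    eta = model(y)                      # forward pass: unfolded CSC gives x_T, then Hx_T
    loss = nll_loss(y, eta, "poisson")  # negative log-likelihood loss
    opt.zero_grad()
    loss.backward()                     # backward pass: gradients w.r.t. the dictionary H
    opt.step()                          # dictionary update (the CDL step)
```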


SLIDE 15

Unsupervised → Supervised

Repurpose DCEA for supervised tasks with two modifications:

1. Loss function: any supervised loss function, e.g., reconstruction MSE loss or perceptual loss.
2. Architecture: relax the constraints → untie the weights of the encoder and decoder, and learn the bias $b$.

|          | Encoder | Decoder |
|----------|---------|---------|
| Original | $x_t = S_b\big(x_{t-1} + \alpha H^T (y - f^{-1}(H x_{t-1}))\big)$ | $H x_T$ |
| Relaxed  | $x_t = S_b\big(x_{t-1} + \alpha W_e^T (y - f^{-1}(W_d x_{t-1}))\big)$ | $W_d x_T$ |

  • Further relaxations are possible, e.g., a deep and non-linear decoder; a sketch of the untied variant follows.
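A minimal sketch of the relaxed (untied) variant, building on the hypothetical `DCEA1d` module above; the initialization and shapes are assumptions.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class DCEAUntied(DCEA1d):
    """Relaxed DCEA sketch: encoder filters We and decoder filters Wd are
    untied and learned separately (cf. DCEA-UC)."""
    def __init__(self, **kwargs):
        super().__init__(**kwargs)
        self.We = nn.Parameter(self.H.detach().clone())  # encoder filters
        self.Wd = nn.Parameter(self.H.detach().clone())  # decoder filters

    def forward(self, y):  # y: (batch, 1, N)
        K = self.Wd.shape[-1]
        x = torch.zeros(y.shape[0], self.Wd.shape[0], y.shape[-1] - K + 1,
                        device=y.device)
        for _ in range(self.T):
            y_tilde = y - self.f_inv(F.conv_transpose1d(x, self.Wd))
            x = F.relu(x + self.alpha * F.conv1d(y_tilde, self.We) - self.b)
        return F.conv_transpose1d(x, self.Wd)  # decoder output Wd x_T
```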


SLIDE 16

1. Motivation
2. Introduction
3. Deep Convolutional Exponential Auto-encoder (DCEA)
4. Experiments
5. Conclusion


SLIDE 17

Experiments

Poisson image denoising

| Baseline framework | Supervised? | Description |
|---|---|---|
| SPDA [5] | ✗ | ECDL + patch-based |
| CA [6] | ✓ | denoising NN |
| DCEA-C (ours) | ✓ | constrained DCEA (tied weights) |
| DCEA-UC (ours) | ✓ | unconstrained DCEA (untied weights) |

[Table: PSNR performance on the test dataset.]


SLIDE 18

Experiments

Poisson image denoising

[Figure: denoising examples: original, noisy (peak = 4), DCEA-C, and DCEA-UC; original, noisy (peak = 2), DCEA-C, and DCEA-UC.]


SLIDE 19

Experiments

Poisson image denoising

  • Classical ECDL: SPDA vs. DCEA-C
    ⇒ better denoising and much more efficient
    ⇒ a classical inference task leveraging the scalability of NNs

  • Denoising NN: CA vs. DCEA-UC
    ⇒ competitive denoising with far fewer parameters
    ⇒ an NN architecture leveraging the generative-model paradigm


SLIDE 20

Experiments

CDL for simulated binomial data

Figure: Example of simulated neural spikes and the underlying rate (ground truth).

Figure: Randomly initialized (blue), true (orange), and learned (green) templates.


SLIDE 21

Experiments

CDL for simulated binomial data

  • If we untie the weights, i.e., relax the generative-model constraints:

[Figure: true vs. learned templates over 20–40 ms; panels (a) $c = 1$ and (b) $c = 2$.]

  • If we treat binomial data as Gaussian observations, i.e., model mismatch.


SLIDE 22

1. Motivation
2. Introduction
3. Deep Convolutional Exponential Auto-encoder (DCEA)
4. Experiments
5. Conclusion


SLIDE 23

Conclusion

In conclusion, the Deep Convolutional Exponential Auto-encoder (DCEA)

  • is a class of NNs based on a generative model for CDL, for data from the natural exponential family.
  • shows competitive performance on Poisson denoising tasks against SOTA frameworks, with an order of magnitude fewer trainable parameters (supervised task).
  • learns accurate convolutional patterns in the ECDL task from simulated binomial and real neural-spiking observations (unsupervised task).


SLIDE 24

References

[1] I. Daubechies, M. Defrise, and C. De Mol. An iterative thresholding algorithm for linear inverse problems with a sparsity constraint. Communications on Pure and Applied Mathematics, 57(11):1413–1457, 2004.

[2] K. Gregor and Y. LeCun. Learning fast approximations of sparse coding. In International Conference on Machine Learning (ICML), pages 399–406, 2010.

[3] D. Simon and M. Elad. Rethinking the CSC model for natural images. In Advances in Neural Information Processing Systems (NeurIPS), 2019.

[4] J. Salmon, Z. Harmany, C.-A. Deledalle, and R. Willett. Poisson noise reduction with non-local PCA. Journal of Mathematical Imaging and Vision, 48(2):279–294, 2014.

[5] R. Giryes and M. Elad. Sparsity-based Poisson denoising with dictionary learning. IEEE Transactions on Image Processing, 23(12):5057–5069, 2014.

[6] T. Remez, O. Litany, R. Giryes, and A. M. Bronstein. Class-aware fully-convolutional Gaussian and Poisson denoising. CoRR, abs/1808.06562, 2018.