Convolutional dictionary learning based auto-encoders for natural exponential-family distributions


1. Convolutional dictionary learning based auto-encoders for natural exponential-family distributions
Bahareh Tolooshams*¹, Andrew H. Song*², Simona Temereanca³, and Demba Ba¹
¹Harvard University, ²Massachusetts Institute of Technology, ³Brown University
*Equal contributions
CRISP Group: https://crisp.seas.harvard.edu
ICML 2020

2. Outline
1. Motivation
2. Introduction
3. Deep Convolutional Exponential Auto-encoder (DCEA)
4. Experiments
5. Conclusion

3. Motivation
Signal Processing (SP): generative models, e.g., the sparse coding model y = Hx + ε (defining p(y | x)) with x sparse.
• Slow and not scalable ✗
• Interpretable ✓
• Memory efficient ✓
Deep Learning:
• Fast and scalable ✓
• Not interpretable ✗
• Memory and computationally expensive ✗
Takeaways:
• Benefit from the scalability of deep learning for traditional SP tasks.
• Use generative models as a guide to design interpretable and memory-efficient networks.

4. Outline
1. Motivation
2. Introduction
3. Deep Convolutional Exponential Auto-encoder (DCEA)
4. Experiments
5. Conclusion

5. Convolutional Dictionary Learning (CDL)
Generative model for each data point j:
y^j = Hx^j + ε^j = Σ_{c=1}^{C} h_c ∗ x_c^j + ε^j,   ε^j ∼ N(0, σ²I),
where x_c^j is sparse.
Goal: Learn H that maps the sparse representation x^j to the data y^j:
min over {h_c}_{c=1}^{C} and {x^j}_{j=1}^{J} of  Σ_{j=1}^{J} (1/2)‖y^j − Hx^j‖₂² + λ‖x^j‖₁
• min w.r.t. x^j → Convolutional Sparse Coding (CSC).
• min w.r.t. H and x^j → Convolutional Dictionary Learning (CDL).
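To make the objective concrete, here is a minimal numerical sketch of the CDL generative model and loss in 1-D, assuming PyTorch and illustrative shapes (C filters of length K; conv1d plays the role of Hx):

```python
import torch
import torch.nn.functional as F

C, K, N = 3, 7, 100          # filters, filter length, signal length (assumed)
H = torch.randn(1, C, K)     # dictionary: the C filters h_c stacked for conv1d
x = torch.zeros(1, C, N)     # sparse code x^j, one channel per filter
x[0, :, torch.randint(0, N, (5,))] = 1.0   # a few active coefficients

# Hx^j = sum_c h_c * x_c^j, realized as one multi-channel convolution
Hx = F.conv1d(x, H, padding=K // 2)        # K odd, so output length stays N
y = Hx + 0.1 * torch.randn_like(Hx)        # ε^j ~ N(0, σ² I)

lam = 0.1
loss = 0.5 * (y - Hx).pow(2).sum() + lam * x.abs().sum()
```

CSC would minimize this loss over x with H fixed; CDL alternates between updating x and H.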

6. Unfolding Networks
Solve CSC and CDL by unrolling an iterative proximal gradient algorithm into a network:
• ISTA [1]: iterations built from H^T and H of the generative model.
• LISTA [2]: learned encoder weights W_e and S.
• CSCNet [3]: convolutional auto-encoder with learned encoder W_e and decoder W_d.
[Block diagrams of the three architectures.]
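As a sketch of what unfolding means, the snippet below writes T iterations of ISTA as a feed-forward network; a dense H is used for brevity (the convolutional case replaces the matrix products with conv and transposed conv), and LISTA-style variants simply make the encoder weights learnable. All names and sizes are illustrative:

```python
import torch

def soft_threshold(z, b):
    # proximal operator of b * ||.||_1
    return torch.sign(z) * torch.clamp(z.abs() - b, min=0.0)

def unrolled_ista(y, H, alpha, b, T=10):
    # Each "layer" computes x_t = S_b(x_{t-1} + alpha * H^T (y - H x_{t-1}))
    x = torch.zeros(H.shape[1])
    for _ in range(T):
        x = soft_threshold(x + alpha * H.T @ (y - H @ x), b)
    return x

H = torch.randn(50, 100)
y = torch.randn(50)
x_T = unrolled_ista(y, H, alpha=0.01, b=0.1)
```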


9. What if the observations are no longer Gaussian?
Count-valued data, e.g., fingerprint images and photon-based imaging.
Classical CDL approach: alternating minimization with a Poisson generative model [4, 5].
• Unsupervised ✓
• Follows a generative model ⇒ interpretable ✓
• Not scalable (denoising a single image can take minutes to hours) ✗

10. Our Contributions
[Architecture diagram: the encoder repeats T iterations built from αH^T, H, and f^{-1}(·); the decoder applies H.]
• An auto-encoder inspired by CDL, termed the Deep Convolutional Exponential Auto-encoder (DCEA), for non-real-valued data.
• Demonstration of the flexibility of DCEA for both
  • unsupervised tasks, e.g., CDL, and
  • supervised tasks, e.g., the Poisson denoising problem.
• Analysis of the gradient dynamics of the shallow exponential auto-encoder (SEA):
  • Proof that SEA recovers the parameters of the generative model.

11. Outline
1. Motivation
2. Introduction
3. Deep Convolutional Exponential Auto-encoder (DCEA)
4. Experiments
5. Conclusion

12. Deep Convolutional Exponential Auto-encoder: Problem description
Natural exponential family with a convolutional generative model:
log p(y | μ) = f(μ)^T y + g(y) − B(μ),   where f(μ) = Hx and x is sparse.

Distribution | Support of y | Inverse link f^{-1}(·) | B(z)
Gaussian     | ℝ            | I(·) (identity)        | (1/2) z^T z
Binomial     | [0..M]       | sigmoid(·)             | −1^T log(1 − z)
Poisson      | [0..∞)       | exp(·)                 | 1^T z

Exponential Convolutional Dictionary Learning (ECDL):
min_{H, x}  −log p(y | μ) + λ‖x‖₁,
i.e., the negative log-likelihood plus a code sparsity constraint.
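A sketch of the penalized negative log-likelihood for the three rows of the table, written in terms of the natural parameter Hx and dropping g(y), which is constant in H and x (function names are illustrative):

```python
import torch

def neg_log_likelihood(y, eta, family, M=1):
    # eta = Hx is the natural parameter f(mu); g(y) terms are dropped
    if family == "gaussian":
        return 0.5 * (y - eta).pow(2).sum()
    if family == "binomial":                    # y in [0..M]
        return (M * torch.log1p(torch.exp(eta)) - y * eta).sum()
    if family == "poisson":                     # y in [0..inf)
        return (torch.exp(eta) - y * eta).sum()
    raise ValueError(family)

def ecdl_loss(y, Hx, x, lam, family):
    return neg_log_likelihood(y, Hx, family) + lam * x.abs().sum()
```

For the Gaussian row this reduces to the CDL objective from earlier.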

13. Deep Convolutional Exponential Auto-encoder: Network architecture
[Diagram: unrolled encoder (repeat T times) followed by the decoder H and inverse link f^{-1}(·).]
Components for different distributions, with ỹ_t = f^{-1}(H x_{t−1}) (see the sketch below):

Distribution | f^{-1}(·)  | Encoder unfolding (x_t)               | Decoder (f(μ̂))
Gaussian     | I(·)       | S_b(x_{t−1} + αH^T (y − ỹ_t))         | H x_T
Binomial     | sigmoid(·) | S_b(x_{t−1} + αH^T ((1/M) y − ỹ_t))   | H x_T
Poisson      | exp(·)     | S_b(x_{t−1} + αH^T (y − ELU(ỹ_t)))    | H x_T
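The table translates into one encoder iteration as sketched below, under two assumptions flagged here because the slide leaves them implicit: S_b is taken to be a shifted ReLU (nonnegative codes), and the Poisson branch uses ELU(·) + 1 as a stable stand-in for exp(·). Shapes follow the 1-D convolutional sketch from earlier:

```python
import torch
import torch.nn.functional as F

def inv_link(z, family):
    if family == "gaussian":
        return z                     # identity
    if family == "binomial":
        return torch.sigmoid(z)
    if family == "poisson":
        return F.elu(z) + 1.0        # assumed ELU-based surrogate for exp(.)
    raise ValueError(family)

def dcea_encoder_step(x, y, H, alpha, b, family, M=1):
    # H: (1, C, K) filters with K odd; x: (1, C, N) codes; y: (1, 1, N) data
    K = H.shape[-1]
    y_tilde = inv_link(F.conv1d(x, H, padding=K // 2), family)
    target = y / M if family == "binomial" else y
    grad = F.conv_transpose1d(target - y_tilde, H, padding=K // 2)  # H^T residual
    return F.relu(x + alpha * grad - b)     # S_b as a shifted ReLU (assumed)
```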

14. Deep Convolutional Exponential Auto-encoder: Training & inference
[Diagram: the forward pass repeats the encoder iteration T times and feeds x_T to the decoder and loss L; the backward pass propagates gradients to H.]
Training:
• Forward pass: estimate the code x_T and compute the loss function.
• Backward pass (back-propagation): estimate the dictionary H.
• Equivalent to alternating minimization in CDL.
Inference: once trained, inference (a forward pass) is fast.
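A minimal training-loop sketch, reusing dcea_encoder_step and ecdl_loss from the snippets above; data_loader is an assumed iterable of observation batches, and all hyperparameters are illustrative:

```python
import torch
import torch.nn.functional as F

C, K, T = 8, 11, 15
H = torch.randn(1, C, K, requires_grad=True)    # learnable dictionary
opt = torch.optim.Adam([H], lr=1e-3)

for y in data_loader:                   # assumed batches of shape (1, 1, N)
    x = torch.zeros(1, C, y.shape[-1])
    for _ in range(T):                  # forward pass: T unrolled iterations
        x = dcea_encoder_step(x, y, H, alpha=0.1, b=0.01, family="poisson")
    loss = ecdl_loss(y, F.conv1d(x, H, padding=K // 2), x,
                     lam=0.1, family="poisson")
    opt.zero_grad()
    loss.backward()                     # backward pass: gradient w.r.t. H
    opt.step()
```

Each step thus mirrors one round of alternating minimization: the forward pass solves for the code with H fixed, and the backward pass updates H with the code fixed.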

15. Unsupervised → Supervised
Repurpose DCEA for supervised tasks with two modifications:
1. Loss function: any supervised loss function, e.g., reconstruction MSE loss or perceptual loss.
2. Architecture: relax the constraints, i.e., untie the weights of the encoder and decoder, and learn the bias b.

Encoder (the decoder applies H x_T in both cases):
Original: x_t = S_b(x_{t−1} + α H^T (y − f^{-1}(H x_{t−1})))
Relaxed:  x_t = S_b(x_{t−1} + α (W_e)^T (y − f^{-1}(W_d x_{t−1})))

• Further relaxations are possible, e.g., a deep, non-linear decoder (see the sketch below).
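In code, the relaxation only changes which weights appear inside the encoder; a minimal sketch under the same conventions as above, with W_e and W_d as untied filter banks and b learnable:

```python
import torch.nn.functional as F

def relaxed_encoder_step(x, y, W_e, W_d, alpha, b, inv_link):
    # W_e, W_d: (1, C, K) untied encoder filters, K odd; the decoder
    # elsewhere still applies H to the final code x_T
    K = W_d.shape[-1]
    residual = y - inv_link(F.conv1d(x, W_d, padding=K // 2))
    step = F.conv_transpose1d(residual, W_e, padding=K // 2)   # (W_e)^T residual
    return F.relu(x + alpha * step - b)
```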

16. Outline
1. Motivation
2. Introduction
3. Deep Convolutional Exponential Auto-encoder (DCEA)
4. Experiments
5. Conclusion

17. Experiments: Poisson image denoising
Baseline frameworks:

Framework      | Supervised? | Description
SPDA [5]       | ✗           | ECDL + patch-based
CA [6]         | ✓           | denoising NN
DCEA-C (ours)  | ✓           | constrained DCEA (tied weights)
DCEA-UC (ours) | ✓           | unconstrained DCEA (untied weights)

[Table: PSNR performance on the test dataset.]

18. Experiments: Poisson image denoising
[Qualitative results, two rows: original, noisy (peak = 4), DCEA-C, DCEA-UC; and original, noisy (peak = 2), DCEA-C, DCEA-UC.]

19. Experiments: Poisson image denoising
• Classical ECDL, SPDA vs. DCEA-C: better denoising and far more efficient ⇒ a classical inference task leveraging the scalability of NNs.
• Denoising NN, CA vs. DCEA-UC: competitive denoising with far fewer parameters ⇒ an NN architecture leveraging the generative-model paradigm.

20. Experiments: CDL for simulated binomial data
Figure: Example of simulated neural spikes and the underlying rate (ground truth).
Figure: Randomly initialized (blue), true (orange), and learned (green) templates.

21. Experiments: CDL for simulated binomial data
• If we untie the weights, i.e., relax the generative-model constraints:
[Figure: true vs. learned templates, panels (a) c = 1 and (b) c = 2, over 0-40 ms.]
• If we treat the binomial data as Gaussian observations, i.e., model mismatch.

22. Outline
1. Motivation
2. Introduction
3. Deep Convolutional Exponential Auto-encoder (DCEA)
4. Experiments
5. Conclusion
