

SLIDE 1

Convolutional dictionary learning based auto-encoders for natural exponential-family distributions

Bahareh Tolooshams*¹, Andrew H. Song*², Simona Temereanca³, and Demba Ba¹

¹Harvard University, ²Massachusetts Institute of Technology, ³Brown University. *Equal contributions.

CRISP Group: https://crisp.seas.harvard.edu

ICML 2020


SLIDE 2

1. Motivation
2. Introduction
3. Deep Convolutional Exponential Auto-encoder (DCEA)
4. Experiments
5. Conclusion


SLIDE 3

Motivation

Deep Learning

  • Fast and scalable ✓
  • Not interpretable ✗
  • Memory and computationally expensive ✗

Signal Processing (SP)

Generative models, e.g., the sparse coding model $y = Hx + \epsilon$, where $x$ is sparse.

  • Slow and not scalable ✗
  • Interpretable ✓
  • Memory efficient ✓

Goals:

  • Benefit from the scalability of deep learning for traditional SP tasks.
  • Use generative models as a guide to design interpretable and memory-efficient networks.


SLIDE 4

1. Motivation
2. Introduction
3. Deep Convolutional Exponential Auto-encoder (DCEA)
4. Experiments
5. Conclusion


SLIDE 5

Convolutional Dictionary Learning (CDL)

Generative model for each data point $j$:

$$y_j = \sum_{c=1}^{C} h_c * x_j^c + \epsilon_j = H x_j + \epsilon_j, \quad \epsilon_j \sim \mathcal{N}(0, \sigma^2 I),$$

where each $x_j^c$ is sparse.

Goal: Learn $H$ that maps the sparse representation $x_j$ to the data $y_j$:

$$\min_{\{h_c\}_{c=1}^{C},\, \{x_j\}_{j=1}^{J}} \sum_{j=1}^{J} \frac{1}{2} \|y_j - H x_j\|_2^2 + \lambda \|x_j\|_1$$

  • min w.r.t. $x_j$ → Convolutional Sparse Coding (CSC).
  • min w.r.t. $H$ and $x_j$ → Convolutional Dictionary Learning (CDL).
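The CSC step can be solved with an iterative proximal gradient method (ISTA [1], introduced on the next slide). Below is a minimal PyTorch sketch for a single 1-D signal; the function names, shapes, and hyperparameter defaults are illustrative assumptions, not the authors' implementation.

```python
import torch
import torch.nn.functional as F

def soft_threshold(z, lam):
    # proximal operator of lam * ||.||_1
    return z.sign() * torch.clamp(z.abs() - lam, min=0.0)

def csc_ista(y, h, lam=0.1, alpha=0.01, T=100):
    """Convolutional sparse coding via ISTA for one 1-D example.
    y: (1, 1, N) signal; h: (C, 1, K) filters; returns codes x of
    shape (1, C, N - K + 1)."""
    C, _, K = h.shape
    x = torch.zeros(1, C, y.shape[-1] - K + 1)
    for _ in range(T):
        r = y - F.conv_transpose1d(x, h)  # residual y - Hx, with Hx = sum_c h_c * x^c
        # gradient step on the quadratic term, then soft-thresholding:
        # x <- S_{alpha*lam}(x + alpha * H^T r)
        x = soft_threshold(x + alpha * F.conv1d(r, h), alpha * lam)
    return x
```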


SLIDE 6

Unfolding Networks

Solve CSC and CDL by unfolding an iterative proximal gradient algorithm into a network:

  • ISTA [1]: iterates $x_t = S\big(x_{t-1} + \alpha H^T (y - H x_{t-1})\big)$ with a fixed dictionary $H$.
  • LISTA [2]: unfolds ISTA for a fixed number of iterations and learns the weights ($W_e$, $S$).
  • CSCNet [3]: a convolutional unfolding with encoder filters $W_e$ and decoder filters $W_d$.

[Diagram: the ISTA, LISTA, and CSCNet architectures, each unrolling $y \to \tilde{y}_t \to x_t \to x_T$.]
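A minimal sketch of such an unfolded network in the spirit of CSCNet [3], with learned encoder/decoder filters and learned thresholds; the class name, shapes, and defaults are assumptions, not the reference implementation.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class UnrolledCSC(nn.Module):
    """CSCNet-style unrolling sketch (1-D): T unfolded ISTA iterations with
    learned encoder filters We, decoder filters Wd, and thresholds lam."""
    def __init__(self, num_filters=8, kernel_size=21, T=5):
        super().__init__()
        self.We = nn.Parameter(0.1 * torch.randn(num_filters, 1, kernel_size))
        self.Wd = nn.Parameter(0.1 * torch.randn(num_filters, 1, kernel_size))
        self.lam = nn.Parameter(0.01 * torch.ones(1, num_filters, 1))
        self.T = T

    def shrink(self, z):  # soft-thresholding with learned per-filter thresholds
        return z.sign() * F.relu(z.abs() - self.lam)

    def forward(self, y):  # y: (batch, 1, N)
        K = self.We.shape[-1]
        x = torch.zeros(y.shape[0], self.We.shape[0], y.shape[-1] - K + 1,
                        device=y.device)
        for _ in range(self.T):
            r = y - F.conv_transpose1d(x, self.Wd)     # residual
            x = self.shrink(x + F.conv1d(r, self.We))  # learned ISTA step
        return F.conv_transpose1d(x, self.Wd)          # reconstruction Wd x_T
```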


slide-7
SLIDE 7

Unfolding Networks

Solve CSC and CDL by iterative proximal gradient algorithm. y ˜ yt αHT xt xT H ISTA [1]:

  • y

We xt xT S LISTA [2]: y ˜ yt We xt xT H Wd CSCNet [3]:

  • Harvard CRISP

Convolutional dictionary learning based auto-encoders for natural exponential-family distributions 6 / 22

slide-8
SLIDE 8

Unfolding Networks

Solve CSC and CDL by iterative proximal gradient algorithm. y ˜ yt αHT xt xT H ISTA [1]:

  • y

We xt xT S LISTA [2]: y ˜ yt We xt xT H Wd CSCNet [3]:

  • Harvard CRISP

Convolutional dictionary learning based auto-encoders for natural exponential-family distributions 6 / 22

SLIDE 9

What if the observations are no longer Gaussian?

[Images: examples of count-valued data: a fingerprint and photon-based imaging.]

Classical CDL approach: alternating minimization with a Poisson generative model [4, 5].

  • Unsupervised ✓
  • Follows a generative model ⇒ interpretable ✓
  • Not scalable (denoising a single image can take minutes to hours) ✗


SLIDE 10

Our Contributions

[Diagram: DCEA auto-encoder; the encoder repeats the unfolded update $T$ times to produce $x_T$, and the decoder applies $H$ followed by the inverse link $f^{-1}(\cdot)$.]

  • An auto-encoder inspired by CDL, termed the Deep Convolutional Exponential Auto-encoder (DCEA), for non-real-valued data.
  • A demonstration of the flexibility of DCEA for both
      • unsupervised tasks, e.g., CDL, and
      • supervised tasks, e.g., the Poisson denoising problem.
  • An analysis of the gradient dynamics of the shallow exponential auto-encoder (SEA), proving that SEA recovers the parameters of the generative model.


SLIDE 11

1. Motivation
2. Introduction
3. Deep Convolutional Exponential Auto-encoder (DCEA)
4. Experiments
5. Conclusion


SLIDE 12

Deep Convolutional Exponential Auto-encoder

Problem description

Natural exponential family with a convolutional generative model:

$$\log p(y \mid \mu) = f(\mu)^T y + g(y) - B(\mu), \quad \text{where } f(\mu) = Hx \text{ and } x \text{ is sparse.}$$

| Distribution | $y$ | $B(z)$ | Inverse link $f^{-1}(\cdot)$ |
|---|---|---|---|
| Gaussian | $\mathbb{R}$ | $z^T z$ | $I(\cdot)$ (identity) |
| Binomial | $[0..M]$ | $-1^T \log(1 - z)$ | $\mathrm{sigmoid}(\cdot)$ |
| Poisson | $[0..\infty)$ | $1^T z$ | $\exp(\cdot)$ |

Exponential Convolutional Dictionary Learning (ECDL):

$$\min_{H, x} \underbrace{-\log p(y \mid \mu)}_{\text{negative log-likelihood}} + \underbrace{\lambda \|x\|_1}_{\text{code sparsity constraint}}$$
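For training, the negative log-likelihood can be written out per distribution. A minimal sketch follows; the helper name is an assumption, and the data-dependent constant $g(y)$ is dropped since it does not affect the optimization.

```python
import torch
import torch.nn.functional as F

def nll_loss(y, eta, family="poisson", M=1):
    """Negative log-likelihood of y given the natural parameter eta = Hx,
    up to constants in y. Illustrative helper, not the authors' code."""
    if family == "gaussian":    # -log N(y; eta, I), unit variance
        return 0.5 * (y - eta).pow(2).sum()
    if family == "binomial":    # y in [0..M], mean M * sigmoid(eta)
        return (M * F.softplus(eta) - y * eta).sum()
    if family == "poisson":     # mean exp(eta)
        return (torch.exp(eta) - y * eta).sum()
    raise ValueError(f"unknown family: {family}")
```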


SLIDE 13

Deep Convolutional Exponential Auto-encoder

Network architecture

[Diagram: the encoder repeats the unfolded update $T$ times; the decoder applies $H$ and the inverse link $f^{-1}(\cdot)$.]

Components for different distributions, with residual $\tilde{y}_t = y - f^{-1}(H x_{t-1})$:

| Distribution | $y$ | $f^{-1}(\cdot)$ | Encoder unfolding ($x_t$) | Decoder ($f(\hat{\mu})$) |
|---|---|---|---|---|
| Gaussian | $\mathbb{R}$ | $I(\cdot)$ | $S_b\big(x_{t-1} + \alpha H^T \tilde{y}_t\big)$ | $H x_T$ |
| Binomial | $[0..M]$ | $\mathrm{sigmoid}(\cdot)$ | $S_b\big(x_{t-1} + \alpha H^T (\tfrac{1}{M}\tilde{y}_t)\big)$ | $H x_T$ |
| Poisson | $[0..\infty)$ | $\exp(\cdot)$ | $S_b\big(x_{t-1} + \alpha H^T \mathrm{ELU}(\tilde{y}_t)\big)$ | $H x_T$ |
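A minimal PyTorch sketch of this constrained (tied-weight) encoder/decoder for 1-D signals. The class name, filter shapes, step size, and the one-sided shrinkage (ReLU, assuming nonnegative codes; a two-sided soft threshold also fits $S_b$) are illustrative assumptions.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class DCEA1d(nn.Module):
    """Sketch of a constrained (tied-weight) DCEA for 1-D signals: the encoder
    unfolds T proximal-gradient steps; the decoder reuses the same filters H."""
    def __init__(self, num_filters=8, kernel_size=21, T=10, alpha=0.1,
                 family="poisson", M=1):
        super().__init__()
        self.H = nn.Parameter(0.1 * torch.randn(num_filters, 1, kernel_size))
        self.b = nn.Parameter(0.01 * torch.ones(1, num_filters, 1))  # learned threshold
        self.T, self.alpha, self.family, self.M = T, alpha, family, M

    def f_inv(self, eta):  # inverse link per distribution (see the table above)
        if self.family == "gaussian":
            return eta
        if self.family == "binomial":
            return torch.sigmoid(eta)
        return torch.exp(eta)  # poisson

    def forward(self, y):  # y: (batch, 1, N)
        K = self.H.shape[-1]
        x = torch.zeros(y.shape[0], self.H.shape[0], y.shape[-1] - K + 1,
                        device=y.device)
        for _ in range(self.T):
            y_tilde = y - self.f_inv(F.conv_transpose1d(x, self.H))  # residual
            if self.family == "binomial":
                y_tilde = y_tilde / self.M
            elif self.family == "poisson":
                y_tilde = F.elu(y_tilde)  # ELU on the residual, per the table
            # S_b: shrinkage with learned bias b (one-sided here)
            x = F.relu(x + self.alpha * F.conv1d(y_tilde, self.H) - self.b)
        return F.conv_transpose1d(x, self.H)  # natural parameter Hx_T
```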


SLIDE 14

Deep Convolutional Exponential Auto-encoder

Training & inference

[Diagram: forward pass through $T$ unfolded encoder iterations and the decoder to the loss $L$; backward pass via back-propagation.]

Training

  • Forward pass: estimate the code $x_T$ and compute the loss function.
  • Backward pass (back-propagation): estimate the dictionary $H$.
  • Equivalent to alternating minimization in CDL (see the training-loop sketch below).

Inference: once trained, inference (a forward pass) is fast.
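A minimal training-loop sketch illustrating this, reusing the hypothetical `DCEA1d` module and `nll_loss` helper from the earlier sketches; `data_loader` is an assumed iterable of batches.

```python
import torch

# Assumed setup: DCEA1d and nll_loss are the sketches given earlier;
# data_loader is a hypothetical iterable of (batch, 1, N) count-valued signals.
model = DCEA1d(family="poisson")
opt = torch.optim.Adam(model.parameters(), lr=1e-3)

for y in data_loader:
    eta = model(y)                      # forward pass: unfolded CSC gives x_T, then Hx_T
    loss = nll_loss(y, eta, "poisson")  # negative log-likelihood loss
    opt.zero_grad()
    loss.backward()                     # backward pass: gradients w.r.t. the dictionary H
    opt.step()                          # dictionary update (the CDL step)
```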


SLIDE 15

Unsupervised → Supervised

Repurpose DCEA for supervised tasks with two modifications:

1. Loss function: any supervised loss function, e.g., reconstruction MSE loss or perceptual loss.
2. Architecture: relax the constraints → untie the weights of the encoder and decoder, and learn the bias $b$.

|          | Encoder | Decoder |
|----------|---------|---------|
| Original | $x_t = S_b\big(x_{t-1} + \alpha H^T (y - f^{-1}(H x_{t-1}))\big)$ | $H x_T$ |
| Relaxed  | $x_t = S_b\big(x_{t-1} + \alpha W_e^T (y - f^{-1}(W_d x_{t-1}))\big)$ | $W_d x_T$ |

  • Further relaxations are possible, e.g., a deep and non-linear decoder; a sketch of the untied variant follows.
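A minimal sketch of the relaxed (untied) variant, building on the hypothetical `DCEA1d` module above; the initialization and shapes are assumptions.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class DCEAUntied(DCEA1d):
    """Relaxed DCEA sketch: encoder filters We and decoder filters Wd are
    untied and learned separately (cf. DCEA-UC)."""
    def __init__(self, **kwargs):
        super().__init__(**kwargs)
        self.We = nn.Parameter(self.H.detach().clone())  # encoder filters
        self.Wd = nn.Parameter(self.H.detach().clone())  # decoder filters

    def forward(self, y):  # y: (batch, 1, N)
        K = self.Wd.shape[-1]
        x = torch.zeros(y.shape[0], self.Wd.shape[0], y.shape[-1] - K + 1,
                        device=y.device)
        for _ in range(self.T):
            y_tilde = y - self.f_inv(F.conv_transpose1d(x, self.Wd))
            x = F.relu(x + self.alpha * F.conv1d(y_tilde, self.We) - self.b)
        return F.conv_transpose1d(x, self.Wd)  # decoder output Wd x_T
```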


SLIDE 16

1. Motivation
2. Introduction
3. Deep Convolutional Exponential Auto-encoder (DCEA)
4. Experiments
5. Conclusion


SLIDE 17

Experiments

Poisson image denoising

| Baseline framework | Supervised? | Description |
|---|---|---|
| SPDA [5] | ✗ | ECDL + patch-based |
| CA [6] | ✓ | denoising NN |
| DCEA-C (ours) | ✓ | constrained DCEA (tied weights) |
| DCEA-UC (ours) | ✓ | unconstrained DCEA (untied weights) |

[Table: PSNR performance on the test dataset.]


SLIDE 18

Experiments

Poisson image denoising

[Figure: denoising examples: original, noisy (peak = 4), DCEA-C, and DCEA-UC; original, noisy (peak = 2), DCEA-C, and DCEA-UC.]


SLIDE 19

Experiments

Poisson image denoising

  • Classical ECDL: SPDA vs. DCEA-C
    ⇒ better denoising and much more efficient
    ⇒ a classical inference task leveraging the scalability of NNs

  • Denoising NN: CA vs. DCEA-UC
    ⇒ competitive denoising with far fewer parameters
    ⇒ an NN architecture leveraging the generative-model paradigm


SLIDE 20

Experiments

CDL for simulated binomial data

Figure: Example of simulated neural spikes and the underlying rate (ground truth).

Figure: Randomly initialized (blue), true (orange), and learned (green) templates.


SLIDE 21

Experiments

CDL for simulated binomial data

  • If we untie the weights, i.e., relax the generative-model constraints:

[Figure: true vs. learned templates over 20–40 ms; panels (a) $c = 1$ and (b) $c = 2$.]

  • If we treat binomial data as Gaussian observations, i.e., model mismatch.


SLIDE 22

1. Motivation
2. Introduction
3. Deep Convolutional Exponential Auto-encoder (DCEA)
4. Experiments
5. Conclusion


SLIDE 23

Conclusion

In conclusion, the Deep Convolutional Exponential Auto-encoder (DCEA)

  • is a class of NNs based on a generative model for CDL, for data from the natural exponential family.
  • shows competitive performance on Poisson denoising tasks against SOTA frameworks, with an order of magnitude fewer trainable parameters (supervised task).
  • learns accurate convolutional patterns in the ECDL task from simulated binomial and real neural-spiking observations (unsupervised task).


SLIDE 24

References

[1] I. Daubechies, M. Defrise, and C. De Mol. An iterative thresholding algorithm for linear inverse problems with a sparsity constraint. Communications on Pure and Applied Mathematics, 57(11):1413–1457, 2004.

[2] K. Gregor and Y. LeCun. Learning fast approximations of sparse coding. In International Conference on Machine Learning (ICML), pages 399–406, 2010.

[3] D. Simon and M. Elad. Rethinking the CSC model for natural images. In Advances in Neural Information Processing Systems (NeurIPS), 2019.

[4] J. Salmon, Z. Harmany, C.-A. Deledalle, and R. Willett. Poisson noise reduction with non-local PCA. Journal of Mathematical Imaging and Vision, 48(2):279–294, 2014.

[5] R. Giryes and M. Elad. Sparsity-based Poisson denoising with dictionary learning. IEEE Transactions on Image Processing, 23(12):5057–5069, 2014.

[6] T. Remez, O. Litany, R. Giryes, and A. M. Bronstein. Class-aware fully-convolutional Gaussian and Poisson denoising. CoRR, abs/1808.06562, 2018.