SLIDE 1

Latent LSTM Allocation

Manzil Zaheer, Amr Ahmed and Alexander J Smola

Presented by Akshay Budhkar & Krishnapriya Vishnubhotla

March 3, 2018

Manzil Zaheer, Amr Ahmed and Alexander J Smola (Presented by Akshay Budhkar & Krishnapriya Vishnubhotla) Latent LSTM Allocation March 3, 2018 1 / 22

slide-2
SLIDE 2

Outline

1. Introduction
   - Latent Dirichlet Allocation
   - LSTMs
2. Latent LSTM Allocation
   - Algorithm
   - Inference
   - Different Models
3. Results
4. Conclusion

SLIDE 3

Latent Dirichlet Allocation

- Probabilistic graphical model
- Not sequential, but easily interpretable
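As a sketch of why LDA is not sequential, its generative process can be written out directly; `theta` and `phi` would normally be Dirichlet draws, but are fixed by hand here for illustration:

```python
import numpy as np

def generate_lda_document(doc_len, theta, phi, rng):
    """Sample one document from the LDA generative process.

    theta: (K,) per-document topic mixture (a draw from Dirichlet(alpha)).
    phi:   (K, V) per-topic word distributions (draws from Dirichlet(beta)).
    Each word's topic is drawn i.i.d. from theta: LDA is a bag-of-words
    model with no notion of word order.
    """
    K, V = phi.shape
    words = []
    for _ in range(doc_len):
        z = rng.choice(K, p=theta)    # topic for this position, independent of the past
        w = rng.choice(V, p=phi[z])   # word given the topic
        words.append(w)
    return words

rng = np.random.default_rng(0)
theta = np.array([0.7, 0.3])                 # 2 topics
phi = np.array([[0.9, 0.1, 0.0],             # topic 0 favors word 0
                [0.0, 0.1, 0.9]])            # topic 1 favors word 2
doc = generate_lda_document(20, theta, phi, rng)
```

Because the topic draw at each position ignores all previous positions, the model is easy to interpret (topics are just word distributions) but blind to sequence.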

SLIDE 4

LSTMs

- Good for modeling sequential data; preserves the temporal aspect
- Too many parameters
- Hard to interpret

SLIDE 5

Latent LSTM Allocation (LLA) - Algorithm

SLIDE 6

Graphical model for LLA

SLIDE 7

The marginal probability of observing a document is

    p(w_d | LSTM, φ) = Σ_{z_d} p(w_d, z_d | LSTM, φ)
                     = Σ_{z_d} Π_t p(w_{d,t} | z_{d,t}; φ) · p(z_{d,t} | z_{d,1:t−1}; LSTM)    (1)

The model uses a dense K × H matrix (the LSTM's topic output layer) and a sparse V × K matrix (the topic-word distributions φ).
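The factorization in Eq. (1) can be made concrete on a toy model. Purely for illustration, the LSTM's topic dynamics p(z_{d,t} | z_{d,1:t−1}) are stood in for by a first-order Markov chain (`init`, `transition`), so the sum over all K^T topic sequences can be enumerated exactly:

```python
import itertools
import numpy as np

def lla_marginal(words, phi, transition, init):
    """Brute-force the marginal of Eq. (1) for a toy sequence model.

    phi[k, w] = p(w | z = k); `init` and `transition` replace the LSTM's
    topic dynamics with a first-order Markov chain so the outer sum over
    topic sequences z_d is small enough to enumerate.
    """
    K = phi.shape[0]
    T = len(words)
    total = 0.0
    for z in itertools.product(range(K), repeat=T):  # sum over z_d
        p = init[z[0]] * phi[z[0], words[0]]
        for t in range(1, T):                        # product over t
            p *= transition[z[t - 1], z[t]] * phi[z[t], words[t]]
        total += p
    return total

phi = np.array([[0.8, 0.2],
                [0.3, 0.7]])
transition = np.array([[0.9, 0.1],
                       [0.2, 0.8]])
init = np.array([0.5, 0.5])
p = lla_marginal([0, 1, 1], phi, transition, init)
```

Summing this marginal over every possible word sequence of the same length yields 1, which is a quick sanity check that the factorization defines a proper distribution.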

SLIDE 8

SLIDE 9

Inference

Stochastic Expectation Maximization is used to compute the posterior. The Evidence Lower Bound (ELBO) can be written as:

    Σ_d log p(w_d | LSTM, φ) ≥ Σ_d Σ_{z_d} q(z_d) log [ p(z_d; LSTM) Π_t p(w_{d,t} | z_{d,t}; φ) / q(z_d) ]    (2)

The conditional probability of the topic at time step t is:

    p(z_{d,t} = k | w_{d,t}, z_{d,1:t−1}; LSTM, φ) ∝ p(z_{d,t} = k | z_{d,1:t−1}; LSTM) · p(w_{d,t} | z_{d,t} = k; φ)    (3)

where the word-emission term uses smoothed counts:

    p(w_{d,t} | z_{d,t} = k; φ) = φ_{w,k} = (n_{w,k} + β) / (n_k + V β)    (4)
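A single E-step draw of this scheme can be sketched as follows. The vector `prior_over_topics` stands in for p(z_t = k | z_{1:t−1}; LSTM), which in the real algorithm comes from the LSTM's softmax over topics; the counts here are hypothetical:

```python
import numpy as np

def sample_topic(w, prior_over_topics, n_wk, n_k, beta, rng):
    """One stochastic-EM E-step draw of Eq. (3), using Eq. (4).

    n_wk[w, k] counts occurrences of word w under topic k, n_k[k] the
    per-topic totals, beta the Dirichlet smoother, V the vocabulary size.
    """
    V = n_wk.shape[0]
    phi_wk = (n_wk[w] + beta) / (n_k + V * beta)  # Eq. (4), for all k at once
    p = prior_over_topics * phi_wk                # Eq. (3), unnormalized
    p /= p.sum()
    return rng.choice(len(p), p=p)

rng = np.random.default_rng(1)
n_wk = np.array([[10.0, 0.0],   # word 0 seen only under topic 0 so far
                 [0.0, 10.0]])  # word 1 seen only under topic 1
n_k = n_wk.sum(axis=0)
z = sample_topic(w=0, prior_over_topics=np.array([0.5, 0.5]),
                 n_wk=n_wk, n_k=n_k, beta=0.1, rng=rng)
```

The sampled topic then updates the counts (the M-step for φ) and is fed back to the LSTM as the next input, which is what couples the two components during training.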

SLIDE 10

SLIDE 11

Mathematical Intuition

LDA:

    log p(w) = Σ_t log p(w_t | model) = Σ_t log Σ_{z_t} p(w_t | z_t) p(z_t | doc)    (5)

LSTM:

    log p(w) = Σ_t log p(w_t | w_{t−1}, w_{t−2}, . . . , w_1)    (6)

LLA:

    log p(w) = log Σ_{z_{1:T}} Π_t p(w_t | z_t) p(z_t | z_{t−1}, z_{t−2}, . . . , z_1)    (7)
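The key contrast between Eq. (5) and Eq. (7) is where the sum over topics sits. In LDA it sits inside the product over t, so the likelihood factorizes per token and word order is irrelevant; in LLA the product over t sits inside the sum over z_{1:T}, so it does not. A toy computation (hypothetical numbers) shows the LDA side of this:

```python
import numpy as np

def lda_log_lik(words, theta, phi):
    """Eq. (5): each token's topic is marginalized independently, so the
    document log-likelihood is just a sum of per-token topic mixtures."""
    return float(sum(np.log(theta @ phi[:, w]) for w in words))

theta = np.array([0.6, 0.4])              # p(z | doc)
phi = np.array([[0.8, 0.2],               # phi[k, w] = p(w | z = k)
                [0.3, 0.7]])

# Word order does not matter under Eq. (5): permuting the tokens
# permutes the summands but leaves the total unchanged.
ll_a = lda_log_lik([0, 1, 0], theta, phi)
ll_b = lda_log_lik([0, 0, 1], theta, phi)
```

Under Eq. (7) the analogous permutation changes the score, because each topic's probability depends on the topics that preceded it.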

SLIDE 12

Different Models

SLIDE 13

Perplexity vs. Number of topics (Wikipedia)

SLIDE 14

Perplexity vs. Number of topics (User Search)

Cannot use Char LLA, since URLs lack morphological structure

SLIDE 15

LDA Ablation Study

SLIDE 16

Interpreting Cleaner Topics

SLIDE 17

Interpreting Factored Topics

SLIDE 18

LSTM Topic Embedding (Wikipedia)

SLIDE 19

Convergence Speed

SLIDE 20

Effect of Joint vs. Independent Training

SLIDE 21

Final Thoughts

Pros

- Provides a knob to trade off interpretability and accuracy
- Fewer parameters for a reasonable perplexity
- Cleaner factored topics

Cons

- No comparison against something like hierarchical LDA
- Char LLA cannot be used for every problem
- Perplexity is not a good measure of text-generation accuracy

SLIDE 22

Bibliography

Blei, D. M., Ng, A. Y., and Jordan, M. I. (2003). Latent Dirichlet allocation. Journal of Machine Learning Research, 3(Jan):993–1022.

Galley, M., Brockett, C., Sordoni, A., Ji, Y., Auli, M., Quirk, C., Mitchell, M., Gao, J., and Dolan, B. (2015). deltaBLEU: A discriminative metric for generation tasks with intrinsically diverse targets. arXiv preprint arXiv:1506.06863.

Hochreiter, S. and Schmidhuber, J. (1997). Long short-term memory. Neural Computation, 9(8):1735–1780.

Zaheer, M., Ahmed, A., and Smola, A. J. (2017). Latent LSTM allocation: Joint clustering and non-linear dynamic modeling of sequence data. In International Conference on Machine Learning, pages 3967–3976.
