SLIDE 1

Knowledge Transfer Using Latent Variable Models

Ayan Acharya

UT Austin, Department of ECE

July 21, 2015

SLIDE 2

Motivation & Theme

Motivation:
• Labeled data is sparse in applications like document categorization and object recognition.
• The distribution of data changes across domains or over time.

Theme:
• A shared low-dimensional space for transferring information across domains
• Careful adaptation of the model parameters to fit new data

SLIDE 3

Transfer Learning

• Concurrent knowledge transfer (or multitask learning): multiple domains are learnt simultaneously
• Continual knowledge transfer (or sequential knowledge transfer): models learnt in one domain are carefully adapted to other domains

SLIDE 4

Active Learning

• Only the most informative examples are queried from the unlabeled pool

Figure: Illustration of Active Learning (Pic Courtesy: Burr Settles)

SLIDE 5

Section Outline

• Multitask Learning Using Both Supervised and Latent Shared Topics (ECML 2013)
• Active Multitask Learning Using Both Supervised and Latent Shared Topics (NIPS 2013 Topic Model Workshop, SDM 2014)
• Active Multitask Learning with Annotators’ Rationale
• Joint Modeling of Network and Documents using Gamma Process Poisson Factorization (KDD SRS Workshop 2015, ECML 2015)

SLIDE 6

Multitask Learning Using Both Supervised and Latent Shared Topics (ECML 2013)

SLIDE 7

Problem Setting

In the training corpus, each document/image belongs to a known class and has a set of attributes (supervised topics).

• aYahoo – Classes: carriage, centaur, bag, building, donkey, goat, jetski, monkey, mug, statue, wolf, and zebra; Attributes: “has head”, “has wheel”, “has torso”, and 61 others
• ACM Conf. – Classes: ICML, KDD, SIGIR, WWW, ISPD, DAC; Attributes: keywords
• Train models using words, supervised topics, and class labels; classify completely unlabeled test data (no supervised topics or class labels)

SLIDE 8

Doubly Supervised Latent Dirichlet Allocation (DSLDA)

Figure: DSLDA plate diagram – supervision at both the topic and category level

Figure: Visual representation

Variational EM used for inference and learning

SLIDE 9

Multitask Learning Results: aYahoo

Observation: the multitask learning method with both latent and supervised topics performs better than the other methods.

SLIDE 10

Active Multitask Learning Using Both Supervised and Latent Shared Topics (NIPS13 Topic Model Workshop, SDM 2014)

SLIDE 11

Problem Setting

Figure: Visual representation of Active Doubly Supervised Latent Dirichlet Allocation (Act-DSLDA)

• An active MTL framework that can use and query over both attributes and class labels
• Active learning measure: expected error reduction (see the sketch below)
• Batch mode: variational EM, online SVM
• Active selection mode: incremental EM, online SVM
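
As a rough illustration of the expected-error-reduction measure, the sketch below scores each candidate query by the expected residual error on the unlabeled pool after an incremental model update; the `predict_proba`/`update` API is hypothetical, standing in for the incremental EM and online SVM updates mentioned above, not the paper's exact procedure.

```python
import numpy as np

def expected_error_reduction(model, pool, candidates):
    # Score each candidate by the expected sum of residual (1 - max prob)
    # over the unlabeled pool after an incremental update, then pick the
    # candidate that minimizes it.  `predict_proba` and `update` are a
    # hypothetical incremental-learner API.
    scores = []
    for x in candidates:
        p_y = model.predict_proba([x])[0]        # current belief about the label of x
        risk = 0.0
        for y, p in enumerate(p_y):
            updated = model.update(x, y)         # e.g. one incremental EM step
            probs = updated.predict_proba(pool)  # predictions over the pool
            risk += p * (1.0 - probs.max(axis=1)).sum()
        scores.append(risk)
    return candidates[int(np.argmin(scores))]
```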

SLIDE 12

Active Multitask Learning Results: ACM Conf. Query Distribution

Observation: more category labels (e.g., KDD, ICML, ISPD) are queried in the initial phase; more attributes (keywords) are queried later on.

SLIDE 13

Active Multitask Learning Using Annotators’ Rationale

SLIDE 14

Problem Setting

An active multitask learning framework that can query over attributes, class labels and their rationales

SLIDE 15

Results for Active Multitask Learning with Rationale: ACM Conf.

Figure: Query Distribution
Figure: Learning Curve

Observation: the active learning method with rationales and supervised topics performs much better than the baselines.

SLIDE 16

Active Rationale Results: ACM Conf.

Figure: Query Distribution: ACM Conf.

Observation: more labels with rationales are queried in the initial phase.

SLIDE 17

Gamma Process Poisson Factorization for Joint Modeling of Network and Documents (ECML 2015)

SLIDE 18

GPPF for Joint Network and Topic Modeling (J-GPPF)

SLIDE 19

Characteristics of J-GPPF

• Poisson factorization: $y_{dw} \sim \mathrm{Pois}(\langle\theta_d, \beta_w\rangle)$; latent counts are sampled only for the non-zero entries (see the allocation sketch below)
• Joint Poisson factorization for imputing a graph
• A hierarchy of gamma priors for less sensitivity to initialization
• Nonparametric modeling with closed-form inference updates
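
The reason inference touches only the non-zeros: conditioned on the total $y_{dw}$, the latent counts $(y_{dw1}, \ldots, y_{dwK})$ are multinomial with probabilities proportional to $\theta_{dk}\beta_{wk}$, and $y_{dw} = 0$ forces all of its latent counts to zero. A minimal NumPy sketch of this allocation step (array shapes and variable names are my own):

```python
import numpy as np

rng = np.random.default_rng(0)

def allocate_latent_counts(Y, Theta, Beta):
    # Y: (D, W) observed counts; Theta: (D, K); Beta: (W, K).
    # Conditioned on y_dw, the latent counts (y_dw1, ..., y_dwK) are
    # multinomial with probabilities proportional to theta_dk * beta_wk;
    # zero entries contribute nothing, so only non-zeros are visited.
    K = Theta.shape[1]
    counts = np.zeros(Y.shape + (K,), dtype=int)
    for d, w in zip(*np.nonzero(Y)):
        rate = Theta[d] * Beta[w]
        counts[d, w] = rng.multinomial(Y[d, w], rate / rate.sum())
    return counts
```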

SLIDE 20

Negative Binomial Distribution (NB)

$m \sim \mathrm{NB}(r, p)$: the number of heads seen until $r$ tails occur while tossing a biased coin with probability of heads $p$ (or, the number of successes before $r$ failures in successive Bernoulli trials).

Gamma-Poisson construction: $m \sim \mathrm{Pois}(\lambda)$, $\lambda \sim \mathrm{Gam}(r,\, p/(1-p))$.

Compound Poisson construction: $m = \sum_{t=1}^{\ell} u_t$, $u_t \sim \mathrm{Log}(p)$, $\ell \sim \mathrm{Pois}(-r \log(1-p))$.

Figure: Constructions of the Negative Binomial Distribution

Lemma: If $m \sim \mathrm{NB}(r, p)$ is represented under its compound Poisson representation, then the conditional posterior of $\ell$ given $m$ and $r$ is $(\ell \mid m, r) \sim \mathrm{CRT}(m, r)$, which can be generated via $\ell = \sum_{n=1}^{m} z_n$, $z_n \sim \mathrm{Bernoulli}(r/(n - 1 + r))$.
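
A minimal NumPy sketch of the two constructions and of the CRT draw from the lemma; parameter values are illustrative, and Gam is parameterized by shape and scale as elsewhere in the deck:

```python
import numpy as np

rng = np.random.default_rng(0)

def nb_gamma_poisson(r, p, size):
    # m ~ Pois(lambda), lambda ~ Gam(r, p/(1-p))  =>  m ~ NB(r, p)
    return rng.poisson(rng.gamma(r, p / (1.0 - p), size))

def nb_compound_poisson(r, p, size):
    # m = sum_{t=1}^{ell} u_t, u_t ~ Log(p), ell ~ Pois(-r log(1-p))
    ells = rng.poisson(-r * np.log(1.0 - p), size)
    return np.array([rng.logseries(p, l).sum() if l else 0 for l in ells])

def sample_crt(m, r):
    # ell = sum_{n=1}^m z_n, z_n ~ Bernoulli(r/(n-1+r))  =>  ell ~ CRT(m, r)
    n = np.arange(1, m + 1)
    return int((rng.random(m) < r / (n - 1.0 + r)).sum())

# Both constructions target the same NB(r, p) distribution:
print(nb_gamma_poisson(2.0, 0.3, 5), nb_compound_poisson(2.0, 0.3, 5))
print(sample_crt(10, 2.0))
```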

SLIDE 21

Inference of Shape Parameter of Gamma Distribution

$x_i \sim \mathrm{Pois}(m_i r_2)\ \forall i \in \{1, 2, \ldots, N\}$, $r_2 \sim \mathrm{Gam}(r_1, 1/d)$, $r_1 \sim \mathrm{Gam}(a, 1/b)$.

Lemma: If $x_i \sim \mathrm{Pois}(m_i r_2)\ \forall i$, $r_2 \sim \mathrm{Gam}(r_1, 1/d)$, $r_1 \sim \mathrm{Gam}(a, 1/b)$, then $(r_1 \mid -) \sim \mathrm{Gam}(a + \ell,\, 1/(b - \log(1-p)))$, where $(\ell \mid \{x_i\}_i, r_1) \sim \mathrm{CRT}(\sum_i x_i, r_1)$ and $p = \sum_i m_i / (d + \sum_i m_i)$.
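
A sketch of the resulting Gibbs update for $r_1$, combining the CRT draw with the gamma posterior from the lemma (function and variable names are my own):

```python
import numpy as np

rng = np.random.default_rng(0)

def gibbs_r1(x, m, r1, a, b, d):
    # x, m: arrays of the x_i and m_i.
    # CRT augmentation: ell | {x_i}, r1 ~ CRT(sum_i x_i, r1)
    total = int(x.sum())
    n = np.arange(1, total + 1)
    ell = int((rng.random(total) < r1 / (n - 1.0 + r1)).sum())
    # p = sum_i m_i / (d + sum_i m_i)
    p = m.sum() / (d + m.sum())
    # r1 | - ~ Gam(a + ell, 1/(b - log(1 - p)))
    return rng.gamma(a + ell, 1.0 / (b - np.log(1.0 - p)))
```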

SLIDE 22

J-GPPF Results: Real-world Data

Figure: (a) AUC on NIPS, (b) AUC on Twitter, (c) MAP on NIPS, (d) MAP on Twitter

SLIDE 23

Section Outline

• Bayesian Combination of Classification and Clustering Ensembles (SDM 2013)
• Nonparametric Dynamic Models:
  • Nonparametric Bayesian Factor Analysis for Dynamic Count Matrices (AISTATS 2015)
  • Nonparametric Dynamic Relational Model (KDD MiLeTs Workshop 2015)
  • Nonparametric Dynamic Count Matrix Factorization

SLIDE 24

Bayesian Combination of Classifier and Clustering Ensemble (SDM 2013)

SLIDE 25

Bayesian Combination of Classifier and Clustering Ensemble

Table: From Classifiers

        w_1^(1)   w_2^(1)   ...   w_{r1}^(1)
  x_1      2         3      ...       1
  x_2      1         3      ...       1
  ...     ...       ...     ...      ...
  x_N      2         3      ...       3

Table: From Clusterings

        w_1^(2)   w_2^(2)   ...   w_{r2}^(2)
  x_1      4         5      ...       4
  x_2      2         4      ...       4
  ...     ...       ...     ...      ...
  x_N      2         4      ...       2

Prior work – C3E: An Optimization Framework for Combining Ensembles of Classifiers and Clusterers with Applications to Nontransductive Semisupervised Learning and Transfer Learning (Acharya et al., 2014), ACM Transactions on Knowledge Discovery from Data.

SLIDE 26

Nonparametric Bayesian Factor Analysis for Dynamic Count Matrices (AISTATS 2015)

SLIDE 27

Gamma Poisson Autoregressive Model

$\theta_t \sim \mathrm{Gam}(\theta_{t-1}, 1/c)$, $n_t \sim \mathrm{Pois}(\theta_t)$. The Gamma-Gamma construction breaks conjugacy.
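
A short forward simulation of this chain, assuming Gam is parameterized by shape and scale (the initial value theta0 is an illustrative choice):

```python
import numpy as np

rng = np.random.default_rng(0)

def simulate(theta0, c, T):
    # theta_t ~ Gam(theta_{t-1}, 1/c), n_t ~ Pois(theta_t)
    theta, thetas, counts = theta0, [], []
    for _ in range(T):
        theta = rng.gamma(theta, 1.0 / c)   # shape theta_{t-1}, scale 1/c
        thetas.append(theta)
        counts.append(int(rng.poisson(theta)))
    return np.array(thetas), np.array(counts)

thetas, counts = simulate(theta0=5.0, c=1.0, T=10)
```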

SLIDE 28

Inference in Gamma Poisson Autoregressive Model

Figure: Graphical model over $\theta^{(T-2)}, \theta^{(T-1)}$ and $n^{(T-2)}, n^{(T-1)}, n_T$ with $\theta_T$ marginalized out (Gamma, Poisson, and NB links)

Using the Gamma-Poisson construction of the NB: $n_T \sim \mathrm{NB}(\theta^{(T-1)}, 1/(c+1))$.

SLIDE 29

Inference in Gamma Poisson Autoregressive Model

Figure: The same graphical model with the auxiliary variable $L_T$ (CRT links added)

$n_T \sim \mathrm{NB}(\theta^{(T-1)}, 1/(c+1))$. Augment $L_T \sim \mathrm{CRT}(n_T, \theta^{(T-1)})$.

SLIDE 30

Inference in Gamma Poisson Autoregressive Model

Figure: The same graphical model under the compound Poisson construction (Poisson and SumLog links)

Using the compound Poisson construction of the NB: $n_T = \sum_{t=1}^{L_T} u_t$, $u_t \sim \mathrm{Log}(1/(c+1))$, $L_T \sim \mathrm{Pois}(\theta^{(T-1)} \log((c+1)/c))$. The Gamma-Poisson construction facilitates closed-form Gibbs sampling.

SLIDE 31

Gibbs Sampling in Gamma Poisson Autoregressive Model

• Backward sampling of the augmented variables from $t = T$ to $1$: $L_t \sim \mathrm{CRT}(n'_t, \theta^{(t-1)})$.
• Forward sampling of the latent rates for $t = 1$ to $T$: $\theta_t \sim \mathrm{Gam}(\theta^{(t-1)} + n'_t,\, p_t)$, where $p_t = 1/(1 + c - \log(1 - p_{t+1}))$ with $p_{T+1} = 0$ (so $p_T = 1/(c+1)$, matching the earlier slide), and $n'_t = n_t + L_{t+1}$ with $L_{T+1} = 0$ (implemented in the sketch below).
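
A sketch of one backward-forward sweep as stated above, treating $p_t$ as the scale of the gamma conditional and assuming the base cases $p_{T+1} = 0$, $L_{T+1} = 0$:

```python
import numpy as np

rng = np.random.default_rng(0)

def crt(m, r):
    # ell ~ CRT(m, r): sum of Bernoulli(r/(n-1+r)), n = 1..m
    n = np.arange(1, int(m) + 1)
    return int((rng.random(int(m)) < r / (n - 1.0 + r)).sum())

def gibbs_sweep(n, theta, theta0, c):
    # One sweep for theta_t ~ Gam(theta_{t-1}, 1/c), n_t ~ Pois(theta_t);
    # theta holds the current values of theta_1..theta_T (float array).
    T = len(n)
    p = np.zeros(T + 2)                      # p_{T+1} = 0 (assumed base case)
    for t in range(T, 0, -1):                # p_t = 1/(1 + c - log(1 - p_{t+1}))
        p[t] = 1.0 / (1.0 + c - np.log(1.0 - p[t + 1]))
    prev = np.concatenate(([theta0], theta[:-1]))   # theta_{t-1} for each t
    L = np.zeros(T + 2, dtype=int)                  # L_{T+1} = 0
    for t in range(T, 0, -1):                # backward: L_t ~ CRT(n'_t, theta_{t-1})
        L[t] = crt(n[t - 1] + L[t + 1], prev[t - 1])
    for t in range(1, T + 1):                # forward: theta_t ~ Gam(theta_{t-1} + n'_t, p_t)
        theta[t - 1] = rng.gamma(prev[t - 1] + n[t - 1] + L[t + 1], p[t])
        if t < T:
            prev[t] = theta[t - 1]           # freshly sampled theta_t feeds theta_{t+1}
    return theta
```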

SLIDE 32

Gamma Process Dynamic Poisson Factor Analysis (GPDPFA)

$n_{wt} = \sum_k n_{wtk}$, $n_{wtk} \sim \mathrm{Pois}(\lambda_k \phi_{wk} \theta_{tk})$.

$\lambda_k \sim \mathrm{Gam}(r_0/K, 1/c)$, $\phi_k \sim \mathrm{Dir}(\eta_1, \ldots, \eta_V)$, $\theta_{tk} \sim \mathrm{Gam}(\theta_{(t-1)k}, 1/c_t)$.
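
A truncated generative sketch of GPDPFA; the truncation level, the hyperparameter values, the symmetric Dirichlet, the constant $c_t = c$, and the initialization $\theta_{0k} = 1$ are all illustrative assumptions:

```python
import numpy as np

rng = np.random.default_rng(0)

def gpdpfa_generate(V=100, T=20, K=10, r0=1.0, c=1.0, eta=0.05):
    lam = rng.gamma(r0 / K, 1.0 / c, K)          # lambda_k ~ Gam(r0/K, 1/c)
    phi = rng.dirichlet(np.full(V, eta), K)      # phi_k ~ Dir(eta, ..., eta), shape (K, V)
    theta = np.ones(K)                           # theta_0k (assumed initialization)
    N = np.empty((T, V), dtype=int)
    for t in range(T):
        theta = rng.gamma(theta, 1.0 / c)        # theta_tk ~ Gam(theta_(t-1)k, 1/c_t)
        N[t] = rng.poisson((lam * theta) @ phi)  # n_wt = sum_k Pois(lambda_k phi_wk theta_tk)
    return N
```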

SLIDE 33

Results from Gamma Process Dynamic Poisson Factor Analysis

Figure: (a) Correlation of original vectors, (b) correlation in the latent space, (c) correlation between original and derived vectors

SLIDE 34

Nonparametric Dynamic Relational Model (KDD MiLeTs Workshop 2015)

SLIDE 35

Gamma Process Poisson Factorization for Dynamic Network Modeling (D-NGPPF)

$b_{tnm} = \mathbb{I}\{x_{tnm} \ge 1\}$, $x_{tnm} = \sum_k x_{tnmk}$, $x_{tnmk} \sim \mathrm{Pois}(r_{tk}\,\phi_{nk}\,\phi_{mk})$.

$r_{tk} \sim \mathrm{Gam}(r_{(t-1)k}/K,\, 1/c)$, $c \sim \mathrm{Gam}(g_0, 1/h_0)$, $r_{0k} \sim \mathrm{Gam}(\gamma_0, 1/f_0)$, $\phi_k \sim \prod_{n=1}^{N} \mathrm{Gam}(a_0, 1/c_n)$, $c_n \sim \mathrm{Gam}(c_0, 1/d_0)$.

SLIDE 36

Results from Dynamic Network Modeling: Synthetic Data

Figure: Results from dynamic model (left) and non-dynamic model (right)

SLIDE 37

Results from Dynamic Network Modeling: Real-world Data

• DSBM: dynamic stochastic block model
• N-GPPF: gamma process Poisson factorization for networks
• MMSB: mixed membership stochastic block model

Figure: AUC Results

Method     Complexity
D-NGPPF    O((S + N + T)K)
DSBM       O(N²KT)
N-GPPF     O((S + N)KT)
MMSB       O(N²KT)

SLIDE 38

Nonparametric Dynamic Count Matrix Factorization

SLIDE 39

Gamma Process Poisson Factorization for Dynamic Count Matrix Factorization (D-CGPPF)

$y_{tdw} = \sum_k y_{tdwk}$, $y_{tdwk} \sim \mathrm{Pois}(r_{tk}\,\theta_{dk}\,\beta_{wk})$.

$r_{tk} \sim \mathrm{Gam}(r_{(t-1)k}/K,\, 1/c)$, $\theta_k \sim \prod_{d=1}^{D} \mathrm{Gam}(a_0, 1/c_d)$, $\beta_k \sim \prod_{w=1}^{V} \mathrm{Gam}(b_0, 1/c_w)$.

SLIDE 40

Results from Dynamic Count Matrix Factorization

• BPTF: Bayesian probabilistic tensor factorization
• C-GPPF: gamma process Poisson factorization for modeling count matrices

Figure: Precision@top-50%
Figure: NDCG@top-50%

Method     Complexity
D-CGPPF    O((S + D + V + T)K)
BPTF       O(DVK² + (D + V + T)K³)
C-GPPF     O((S + D + V)KT)

SLIDE 41

Conclusion and Future Work

Conclusion:
Future work:
• Dynamic topic models
• Dynamic tensor factorization for analysis of EHR data
• Distributed Poisson factorization

SLIDE 42

Questions?

SLIDE 43

Publications

1. Acharya, Ayan, Teffer, Dean, Zhou, Mingyuan, and Ghosh, Joydeep, Network Discovery and Recommendation via Joint Network and Topic Modeling, KDD Workshop on Social Recommender Systems, 2015.
2. Acharya, Ayan, Saha, Avijit, Zhou, Mingyuan, Ghosh, Joydeep, and Teffer, Dean, Nonparametric Dynamic Network Model, KDD Workshop on Mining and Learning from Time Series, 2015.
3. Acharya, Ayan, Ghosh, Joydeep, and Zhou, Mingyuan, Nonparametric Bayesian Factor Analysis for Dynamic Count Matrices, Proc. of AISTATS, 2015.
4. Coletta, Luiz Fernando, Ponti, Moacir, Hruschka, Eduardo R., Acharya, Ayan, and Ghosh, Joydeep, Combining Clustering and Active Learning for the Detection and Learning of New Image Classes, International Journal of Image and Vision Computing (submitted), 2015.
5. Acharya, Ayan, Teffer, Dean, Henderson, Jette, Tyler, Marcus, Zhou, Mingyuan, and Ghosh, Joydeep, Gamma Process Poisson Factorization for Joint Modeling of Network and Documents, ECML, 2015.
6. Ghosh, Joydeep and Acharya, Ayan, A Survey of Consensus Clustering, appearing in Handbook of Cluster Analysis, 2015.
7. Coletta, Luiz F. S., Hruschka, Eduardo R., Acharya, Ayan, and Ghosh, Joydeep, Using Metaheuristics to Optimize the Combination of Classifier and Cluster Ensembles, appearing in Integrated Computer-Aided Engineering, 2015.
8. Acharya, Ayan, Mooney, Raymond J., and Ghosh, Joydeep, Active Multitask Learning Using Both Latent and Supervised Shared Topics, appearing in Pattern Recognition: from Classical to Modern Approaches, 2015.
9. Acharya, Ayan, Hruschka, Eduardo R., Ghosh, Joydeep, and Acharyya, Sreangsu, An Optimization Framework for Combining Ensembles of Classifiers and Clusterers with Applications to Non-transductive Semi-Supervised Learning and Transfer Learning, ACM Transactions on Knowledge Discovery from Data, September 2014.

SLIDE 44

Publications

10. Coletta, Luiz Fernando, Hruschka, Eduardo R., Acharya, Ayan, and Ghosh, Joydeep, A Differential Evolution Algorithm to Optimize the Combination of Classifier and Cluster Ensembles, International Journal of Bio-Inspired Computation, 2014.
11. Acharya, Ayan, Mooney, Raymond J., and Ghosh, Joydeep, Active Multitask Learning Using Both Latent and Supervised Shared Topics, Proc. of the 2014 SIAM International Conference on Data Mining, pp. 190-198, 2014.
12. Acharya, Ayan, Hruschka, Eduardo R., Ghosh, Joydeep, Sarwar, Badrul, and Ruvini, Jean-David, Probabilistic Combination of Classifier and Cluster Ensembles for Non-transductive Learning, SDM, 2013.
13. Gunasekar, Suriya, Acharya, Ayan, Gaur, Neeraj, and Ghosh, Joydeep, Noisy Matrix Completion Using Alternating Minimization, ECML PKDD, Part II, LNAI 8189, pp. 194-209, 2013.
14. Acharya, Ayan, Rawal, Aditya, Mooney, Raymond J., and Hruschka, Eduardo R., Using Both Supervised and Latent Shared Topics for Multitask Learning, ECML PKDD, Part II, LNAI 8189, pp. 369-384, 2013.
15. Ghosh, Joydeep and Acharya, Ayan, Cluster Ensembles: Theory and Applications, in Data Clustering: Algorithms and Applications, 2013.
16. Acharya, Ayan, Mooney, Raymond J., and Ghosh, Joydeep, Active Multitask Learning Using Doubly Supervised Latent Dirichlet Allocation, NIPS Topic Model Workshop, 2013.
17. Ghosh, Joydeep and Acharya, Ayan, A Survey of Consensus Clustering, appearing in Handbook of Cluster Analysis, 2013.
18. Coletta, Luiz Fernando, Hruschka, Eduardo R., Acharya, Ayan, and Ghosh, Joydeep, Towards the Use of Metaheuristics for Optimizing the Combination of Classifier and Cluster Ensembles, 11th Brazilian Congress on Computational Intelligence (CBIC), 2013.
19. Acharya, Ayan, Hruschka, Eduardo R., Ghosh, Joydeep, and Acharyya, Sreangsu, Transfer Learning with Cluster Ensembles, Journal of Machine Learning Research - Proceedings Track, 27, pp. 123-132, 2012.

SLIDE 45

Publications

20. Acharya, Ayan, Lee, Jangwon, and Chen, An, Real Time Car Detection and Tracking in Mobile Devices, IEEE International Conference on Connected Vehicles and Expo, 2012.
21. Ghosh, Joydeep and Acharya, Ayan, Cluster Ensembles, Wiley Interdisciplinary Reviews: Data Mining and Knowledge Discovery, 1(4), pp. 305-315, 2011.
22. Acharya, Ayan, Hruschka, Eduardo R., Ghosh, Joydeep, and Acharyya, Sreangsu, C3E: A Framework for Combining Ensembles of Classifiers and Clusterers, MCS, pp. 269-278, 2011.
23. Acharya, Ayan, Hruschka, Eduardo R., and Ghosh, Joydeep, A Privacy-Aware Bayesian Approach for Combining Classifier and Cluster Ensembles, SocialCom/PASSAT, pp. 1169-1172, 2011.

SLIDE 46

Baselines: Multitask learning experiments

Figure: MedLDA-OVA
Figure: MedLDA-MTL
Figure: DSLDA-OSST
Figure: DSLDA-NSLT

SLIDE 47

Baselines: Active multitask learning experiments

Figure: Random MedLDA-MTL (R-MedLDA-MTL)
Figure: Random DSLDA (R-DSLDA)
Figure: Active MedLDA-OVA (Act-MedLDA-OVA)
Figure: Active MedLDA-MTL (Act-MedLDA-MTL)

SLIDE 48

Active multitask learning results: ACM Conf. learning curves

Observation: the active learning method with both latent and supervised topics performs much better than the other baselines, which do not use active learning and/or two different sets of topics.

SLIDE 49

Gamma Process (GP)

Figure: Illustration of Gamma Process

The gamma process $G \sim \Gamma\mathrm{P}(G_0, c)$ is a completely random measure defined on the product space $\mathbb{R}_+ \times \Omega$, with concentration parameter $c$ and a finite and continuous base measure $G_0$ over a complete separable metric space $\Omega$, such that $G(A_i) \sim \mathrm{Gam}(G_0(A_i), 1/c)$ are independent gamma random variables for any disjoint partition $\{A_i\}_i$ of $\Omega$.

SLIDE 50

Gamma Process (GP)

$G = \sum_{k=1}^{\infty} r_k \delta_{\omega_k}$, where $(r_k, \omega_k) \overset{iid}{\sim} r^{-1} e^{-cr}\, dr\, G_0(d\omega)$.

SLIDE 51

Gamma Process (GP)

Finite approximation of the $\Gamma\mathrm{P}$: $G = \sum_{k=1}^{K} r_k \delta_{\omega_k}$, where $(r_k, \omega_k) \overset{iid}{\sim} r^{\gamma_0/K - 1} e^{-cr}\, dr\, G_0(d\omega)$ and $\gamma_0 = G_0(\Omega)$.
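
A sketch of drawing the truncated weights and atoms; the density $r^{\gamma_0/K - 1} e^{-cr}$ matches $\mathrm{Gam}(\gamma_0/K, 1/c)$ up to normalization, and the choice of $\Omega = [0, 1]$ with a uniform normalized base measure is an illustrative assumption:

```python
import numpy as np

rng = np.random.default_rng(0)

def gamma_process_truncated(K, gamma0, c, base_sampler):
    # Finite approximation: r_k ~ Gam(gamma0/K, 1/c); atoms omega_k are
    # drawn iid from the normalized base measure G0/gamma0.
    r = rng.gamma(gamma0 / K, 1.0 / c, K)
    omega = base_sampler(K)
    return r, omega

# Example with Omega = [0, 1] and a uniform normalized base measure:
weights, atoms = gamma_process_truncated(50, gamma0=2.0, c=1.0,
                                         base_sampler=lambda K: rng.random(K))
```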

SLIDE 52

Chinese Restaurant Table Distribution (CRT)

Chinese Restaurant Process: a new customer occupies an empty table with probability proportional to $\gamma_0$, or an occupied table with probability proportional to the number of customers already at that table.

$m$: number of data points (customers); $K$: number of distinct atoms (tables).

$\Pr(K = l \mid m, \gamma_0) = \dfrac{\Gamma(\gamma_0)}{\Gamma(m + \gamma_0)}\, |s(m, l)|\, \gamma_0^{l}$, for $l = 0, 1, \ldots, m$, where $s(m, l)$ is the Stirling number of the first kind.

Figure: Illustration of Chinese Restaurant Table Distribution

Lemma: If $m \sim \mathrm{NB}(r, p)$ is represented under its compound Poisson representation, then the conditional posterior of $\ell$ given $m$ and $r$ is $(\ell \mid m, r) \sim \mathrm{CRT}(m, r)$, which can be generated via $\ell = \sum_{n=1}^{m} z_n$, $z_n \sim \mathrm{Bernoulli}(r/(n - 1 + r))$.

SLIDE 53

GPPF for Joint Network and Topic Modeling (J-GPPF)

$b_{nm} = \mathbb{I}\{x_{nm} \ge 1\}$, $x_{nm} \sim \mathrm{Pois}\Big(\sum_{k_B=1}^{K_B} \rho_{k_B}\,\phi_{nk_B}\,\phi_{mk_B}\Big)$, $\rho_{k_B} \sim \mathrm{Gam}(\gamma_B/K_B,\, 1/c_B)$, $\phi_{k_B} \sim \prod_{n=1}^{N} \mathrm{Gam}(a_B, 1/\sigma_n)$.

SLIDE 54

GPPF for Joint Network and Topic Modeling (J-GPPF)

Network component as on the previous slide; the document counts are modeled as

$y_{dw} \sim \mathrm{Pois}\Big(\sum_{k_Y=1}^{K_Y} r_{k_Y}\,\theta_{dk_Y}\,\beta_{wk_Y} + \epsilon \sum_{k_B=1}^{K_B} \rho_{k_B}\big(\sum_{n} Z_{nd}\,\phi_{nk_B}\big)\psi_{wk_B}\Big)$,

$r_{k_Y} \sim \mathrm{Gam}(\gamma_Y/K_Y,\, 1/c_Y)$, $\theta_{k_Y} \sim \prod_{d=1}^{D} \mathrm{Gam}(a_Y, 1/\kappa_d)$, $\beta_{k_Y} \sim \prod_{w=1}^{V} \mathrm{Gam}(\xi_Y, 1/\eta_w)$, $\psi_{k_B} \sim \prod_{w=1}^{V} \mathrm{Gam}(\xi_B, 1/\zeta_w)$, $\epsilon \sim \mathrm{Gam}(f_0, 1/g_0)$.

SLIDE 55

GPPF for Joint Network and Topic Modeling (J-GPPF)

With hyperpriors on the mass parameters (network and document components as on the previous two slides): $\gamma_B \sim \mathrm{Gam}(e_B, 1/f_B)$, $\gamma_Y \sim \mathrm{Gam}(e_Y, 1/f_Y)$.

SLIDE 56

BC3E: Problem Setting

Table: From Classifiers and Table: From Clusterings – the same classifier and clustering ensemble outputs shown on Slide 26.

Figure: Graphical Model of BC3E

SLIDE 57

Dataset from eBay Inc.

39 top-level nodes called meta-categories and 20K+ bottom-level nodes called leaf categories.

SLIDE 58

Transfer learning on text data from eBay Inc.

Group ID  |X|   k-NN   BGCM           LWE            C3E-Ideal      BC3E
42        1299  64.90  73.78 (±0.94)  76.86 (±1.01)  83.99 (±0.41)  83.68 (±1.09)
84        611   63.67  69.23 (±0.17)  75.24 (±0.26)  81.18 (±0.16)  76.27 (±1.31)
86        2381  77.66  84.33 (±2.74)  83.29 (±1.02)  92.78 (±0.35)  87.20 (±0.91)
67        789   72.75  72.75 (±0.07)  78.03 (±0.72)  82.64 (±0.82)  81.75 (±1.37)
52        1076  76.95  77.01 (±1.18)  77.49 (±1.41)  88.38 (±0.22)  85.04 (±2.14)
99        827   84.04  85.12 (±0.52)  86.90 (±0.92)  91.54 (±0.27)  91.17 (±0.82)
48        3445  86.33  86.19 (±0.25)  90.38 (±1.03)  92.71 (±0.31)  92.71 (±1.16)
94        440   79.32  81.08 (±0.73)  82.52 (±0.83)  85.45 (±0.09)  85.45 (±0.79)
35        4907  82.41  82.10 (±0.37)  85.08 (±1.39)  88.16 (±0.17)  88.22 (±1.21)
45        1952  74.80  73.12 (±0.81)  73.64 (±1.68)  84.32 (±0.23)  77.97 (±0.47)

Table: Performance of BC3E on text classification data — average accuracies ± standard deviations.