Active Multitask Learning Using Both Supervised and Latent Shared Topics

SLIDE 1

Background Act-DSLDA & Act-NPDSLDA Datasets & Empirical Results References

Active Multitask Learning Using Both Supervised and Latent Shared Topics

Ayan Acharya, Raymond J. Mooney, Joydeep Ghosh

UT Austin, Dept. of ECE & CS

April 24, 2014

SLIDE 2

Outline

  • Background
  • Act-DSLDA and Act-NPDSLDA
  • Datasets & Empirical Results
  • References

SLIDE 3

Motivation

  • Multitask learning: data from multiple tasks are collected and the models are learnt simultaneously.
  • Active learning: only the most informative examples are queried from the unlabeled pool.
  • Goal: unify both of these approaches.

SLIDE 4

Problem Setting

  • In the training corpus, each document/image belongs to a known class and has a set of attributes (supervised topics).
  • Classes from the aYahoo data: carriage, centaur, bag, building, donkey, goat, jetski, monkey, mug, statue, wolf, and zebra.
  • Attributes: “has head”, “has wheel”, “has torso”, and 61 others.
  • Models are trained using words, supervised topics, and class labels.
  • Goal: an active MTL framework that can use and query over both attributes and class labels.

SLIDE 5

Transfer with Shared Supervised Attributes

  • Train to infer attributes from visual features.
  • Train to infer categories from attributes [Lampert et al., 2009].

SLIDE 6

Multitask Learning with Shared Latent Features

Reference: [Caruana, 1997]

SLIDE 7

Transfer with Shared Supervised and Latent Attributes

SLIDE 8

Topic Models: LDA

Figure: LDA (plate diagram with symbols N, Mn, θ, z, w, α, β, K)

Figure: Visual Representation
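
For readers who do not have the plate diagram in front of them, the generative story behind the LDA symbols can be sketched in a few lines. This is a minimal illustration of standard smoothed LDA, not code from the presentation: the function names, the symmetric priors, and the hypothetical parameter `eta` (a Dirichlet prior on the topic-word distributions) are assumptions made for the example.

```python
import random


def sample_dirichlet(alpha, rng):
    """Draw from a Dirichlet by normalising independent Gamma draws."""
    draws = [rng.gammavariate(a, 1.0) for a in alpha]
    total = sum(draws)
    return [d / total for d in draws]


def sample_categorical(probs, rng):
    """Return an index drawn according to the probability vector probs."""
    return rng.choices(range(len(probs)), weights=probs, k=1)[0]


def generate_corpus(n_docs, doc_len, n_topics, vocab_size,
                    alpha=0.5, eta=0.1, seed=0):
    """Generate word-id documents from the LDA generative process.

    alpha is the symmetric prior on per-document topic proportions (theta);
    eta is an assumed symmetric prior on the topic-word distributions (beta).
    """
    rng = random.Random(seed)
    # K plate: one word distribution per topic.
    beta = [sample_dirichlet([eta] * vocab_size, rng) for _ in range(n_topics)]
    corpus = []
    for _ in range(n_docs):                      # N plate: documents
        theta = sample_dirichlet([alpha] * n_topics, rng)
        words = []
        for _ in range(doc_len):                 # Mn plate: word positions
            z = sample_categorical(theta, rng)   # topic assignment
            w = sample_categorical(beta[z], rng) # observed word
            words.append(w)
        corpus.append(words)
    return corpus, beta
```

Calling `generate_corpus(3, 5, 2, 10)` returns three documents of five word ids each, together with the two topic-word distributions that produced them. The supervised variants on the following slides add observed labels (Λ, Y) on top of this basic structure.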

SLIDE 9

Topic Models: LLDA

Figure: LLDA (plate diagram with symbols N, Mn, Λ, θ, z, w, α, β, K)

Figure: Visual Representation

SLIDE 10

Topic Models: MedLDA

Figure: MedLDA (plate diagram with symbols N, Mn, Y, r, θ, z, w, α, β, K)

Figure: Visual Representation

SLIDE 11

Topic Models: DSLDA

Doubly Supervised LDA [Acharya et al., 2013]

  • α(1), α(2): priors over supervised and latent topics

Figure: DSLDA (plate diagram with symbols N, Mn, Λ, Y, r, θ, ε, z, w, α(1), α(2), β, K)

Figure: Visual Representation

SLIDE 12

Active DSLDA (Act-DSLDA)

  • r1: weights for multiclass SVM
  • r2: weights for binary SVMs

Figure: Act-DSLDA (plate diagram with symbols N, Mn, Λ, Y, r1, r2, X, θ, ε, z, w, α(1), α(2), β, K)

Figure: Visual Representation
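
The roles of the two weight sets can be illustrated with a small prediction sketch. It assumes, as in MedLDA-style models, that both SVMs act on a document's empirical topic proportions; that feature choice, the absence of bias terms, and all function names are assumptions made for illustration and are not taken from the slides.

```python
def topic_proportions(assignments, n_topics):
    """Fraction of a document's words assigned to each topic."""
    counts = [0] * n_topics
    for z in assignments:
        counts[z] += 1
    n_words = len(assignments)
    return [c / n_words for c in counts]


def dot(u, v):
    """Inner product of two equal-length sequences."""
    return sum(a * b for a, b in zip(u, v))


def predict_class(r1, z_bar):
    """Multiclass decision: the class whose weight vector scores highest."""
    scores = [dot(weights, z_bar) for weights in r1]
    return max(range(len(scores)), key=scores.__getitem__)


def predict_supervised_topics(r2, z_bar):
    """One binary decision per supervised topic (attribute present or not)."""
    return [dot(weights, z_bar) > 0 for weights in r2]
```

Here `r1` holds one weight vector per class and `r2` one weight vector per supervised topic, so a single document representation yields both a class prediction and a set of attribute predictions, which is what lets the framework query either kind of label.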

SLIDE 13

Active NPDSLDA (Act-NPDSLDA)

Non-parametric Doubly Supervised LDA [Acharya et al., 2013]

Figure: NPDSLDA (plate diagram with symbols N, Mn, Λ, Y, r, π(2), π′, c, ε, z, w, δ0, α(2), φ, K2, β′, γ0, η1, η2, ∞)

SLIDE 14

Active NPDSLDA (Act-NPDSLDA)

Non-parametric Doubly Supervised LDA [Acharya et al., 2013]

Figure: NPDSLDA (plate diagram with symbols N, Mn, Λ, Y, r, π(2), π′, c, ε, z, w, δ0, α(2), φ, K2, β′, γ0, η1, η2, ∞)

Figure: Act-NPDSLDA (plate diagram with symbols N, Mn, Λ, Y, r1, r2, π(2), π′, c, ε, z, w, δ0, α(2), X, φ, K2, β′, γ0, η1, η2, ∞)

SLIDE 15

Visual Representation of Act-NPDSLDA

Figure: Visual Representation of Act-NPDSLDA

SLIDE 16

Inference and Learning

  • Active learning measure: expected error reduction [Nigam et al., 1998]
  • Batch mode: variational EM with a completely factorized approximation to the posterior, plus online SVM [Bordes et al., 2007]
  • Active selection mode: incremental EM [Neal and Hinton, 1999], plus online SVM
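
The expected error reduction criterion can be sketched independently of the topic model. In the example below a nearest-centroid classifier with a softmax posterior stands in for the variational EM and online SVM machinery named above; that substitution, the 0/1-loss estimate, and all function names are assumptions made to keep the sketch short and runnable.

```python
import math


def fit_centroids(points, labels, n_classes):
    """Return one mean vector per class (a deliberately tiny stand-in model)."""
    dim = len(points[0])
    sums = [[0.0] * dim for _ in range(n_classes)]
    counts = [0] * n_classes
    for x, y in zip(points, labels):
        counts[y] += 1
        for d in range(dim):
            sums[y][d] += x[d]
    # Classes with no examples keep a zero centroid.
    return [[s / max(c, 1) for s in row] for row, c in zip(sums, counts)]


def posterior(centroids, x):
    """Softmax over negative squared distances to the class centroids."""
    scores = [-sum((a - b) ** 2 for a, b in zip(x, c)) for c in centroids]
    top = max(scores)
    exps = [math.exp(s - top) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def expected_error(centroids, pool):
    """Estimated 0/1 loss over the pool: sum of (1 - max posterior)."""
    return sum(1.0 - max(posterior(centroids, x)) for x in pool)


def select_query(points, labels, pool, n_classes):
    """Pick the pool index whose labeling minimises expected future error."""
    current = fit_centroids(points, labels, n_classes)
    best_index, best_risk = None, float("inf")
    for i, x in enumerate(pool):
        rest = pool[:i] + pool[i + 1:]
        risk = 0.0
        # Average over the possible answers, weighted by the current belief.
        for y, p_y in enumerate(posterior(current, x)):
            retrained = fit_centroids(points + [x], labels + [y], n_classes)
            risk += p_y * expected_error(retrained, rest)
        if risk < best_risk:
            best_index, best_risk = i, risk
    return best_index, best_risk
```

Each candidate requires one retraining per possible label, which is why the presentation pairs the criterion with incremental EM and an online SVM: both allow a cheap model update for every hypothetical label instead of a full refit.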

SLIDE 17

Description of Dataset: ACM Conference

  • Classes: conference names (WWW, SIGIR, KDD, ICML, ISPD, DAC); abstracts of papers are treated as documents.
  • Supervised topics: keywords provided by the authors.

SLIDE 18

Experimental Methodology

  • Multitask training evaluates the benefit of sharing information among classes on the predictive accuracy of all classes.
  • Start with a completely labeled dataset L consisting of 300 documents.
  • In every active iteration, 50 labels (class labels or supervised topics) are queried.
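
The querying protocol above can be sketched as a loop over a single pool that mixes class-label queries and supervised-topic queries. The tuple layout, the function names, and the random scores in the demo are illustrative assumptions; in the presentation the score would come from the active learning measure and the model would be retrained after each batch.

```python
import random


def build_pool(doc_ids, n_topics):
    """One class-label query and n_topics supervised-topic queries per document."""
    pool = set()
    for doc_id in doc_ids:
        pool.add((doc_id, "class", 0))
        for topic_id in range(n_topics):
            pool.add((doc_id, "topic", topic_id))
    return pool


def run_active_learning(pool, score, oracle, batch_size=50, n_iterations=3):
    """Repeatedly query the highest-scoring labels from a mixed pool.

    pool   : set of (doc_id, kind, label_id) tuples, kind is "class" or "topic"
    score  : callable giving the estimated usefulness of answering a query
    oracle : callable returning the true label for a query
    """
    acquired = {}
    for _ in range(n_iterations):
        if not pool:
            break
        batch = sorted(pool, key=score, reverse=True)[:batch_size]
        for query in batch:
            acquired[query] = oracle(query)
            pool.discard(query)
        # A real system would update the model here before re-scoring the pool.
    return acquired


# Demo with placeholder scores and a dummy oracle (both hypothetical).
rng = random.Random(0)
pool = build_pool(range(100), n_topics=3)          # 400 candidate queries
scores = {query: rng.random() for query in pool}
acquired = run_active_learning(pool, scores.get, lambda q: 1,
                               batch_size=50, n_iterations=2)
print(len(acquired), len(pool))                    # 100 acquired, 300 left
```

Keeping both kinds of query in one pool means the selection step decides, per iteration, how the budget of 50 is split between class labels and supervised topics.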

SLIDE 19

Compared Models

Model            Supervised Topics            Latent Topics               Class Labels
Act-DSLDA        present & queried            shared                      queried
Act-NPDSLDA      present & queried            shared                      queried
R-MedLDA-MTL     absent                       shared                      random selection
R-DSLDA          present & random selection   shared & random selection   random selection
Act-MedLDA-OVA   absent                       not shared                  queried
Act-MedLDA-MTL   absent                       shared                      queried
Act-DSLDA-OSST   present & queried            absent                      queried
Act-DSLDA-NSLT   present & queried            not shared                  queried

1. Random MedLDA-MTL (R-MedLDA-MTL)
2. Random DSLDA (R-DSLDA)
3. Active Learning in MedLDA with one-vs-all classification (Act-MedLDA-OVA)
4. Active Learning in MedLDA with multitask learning (Act-MedLDA-MTL)
5. Act-DSLDA with only shared supervised topics (Act-DSLDA-OSST)
6. Act-DSLDA with no shared latent topics (Act-DSLDA-NSLT)

SLIDE 20

Random MedLDA-MTL (R-MedLDA-MTL)

SLIDE 21

Random DSLDA (R-DSLDA)

SLIDE 22

Active Learning in MedLDA with one-vs-all classification (Act-MedLDA-OVA)

SLIDE 23

Active Learning in MedLDA with Multitask Learning (Act-MedLDA-MTL)

SLIDE 24

Act-DSLDA with Only Shared Supervised Topics (Act-DSLDA-OSST)

SLIDE 25

Act-DSLDA with No Shared Latent Topics (Act-DSLDA-NSLT)

SLIDE 26

aYahoo Learning Curves

SLIDE 27

aYahoo Query Distribution

SLIDE 28

ACM Conference Learning Curves

SLIDE 29

ACM Conference Query Distribution

SLIDE 30

Conclusion and Future Work

  • Conclusion: experimental results demonstrate the utility of integrating active and multitask learning in one framework that also unifies latent and supervised shared topics.
  • Future work: better approximation techniques for active selection with large-scale learning.
  • Future work: active querying with annotators’ rationales.

SLIDE 31

References

Acharya, A., Rawal, A., Mooney, R. J., and Hruschka, E. R. (2013). Using both supervised and latent shared topics for multitask learning. In ECML PKDD, Part II, LNAI 8189, pages 369–384.

Bordes, A., Bottou, L., Gallinari, P., and Weston, J. (2007). Solving multiclass support vector machines with LaRank. In Proc. of ICML, pages 89–96.

Caruana, R. (1997). Multitask learning. Machine Learning, 28:41–75.

Lampert, C. H., Nickisch, H., and Harmeling, S. (2009). Learning to detect unseen object classes by between-class attribute transfer. In Proc. of CVPR, pages 951–958.

Neal, R. M. and Hinton, G. E. (1999). A view of the EM algorithm that justifies incremental, sparse, and other variants.

Nigam, K., McCallum, A., Thrun, S., and Mitchell, T. (1998). Learning to classify text from labeled and unlabeled documents. In Proceedings of the Fifteenth National Conference on Artificial Intelligence, pages 792–799. AAAI Press.

SLIDE 32

Questions?