Active Learning for Probabilistic Structured Prediction of Cuts and Matchings

Sima Behpour, University of Pennsylvania; Anqi Liu, California Institute of Technology; Brian D. Ziebart, University of Illinois at Chicago


SLIDE 1

Active Learning for Probabilistic Structured Prediction of Cuts and Matchings

University of Illinois at Chicago

Sima Behpour, University of Pennsylvania Anqi Liu, California Institute of Technology Brian D. Ziebart, University of Illinois at Chicago

SLIDE 2

Motivation

[Figure: an example image with candidate labels Sea, Ship, Sheep, Wolf, Mountain, Person, Dog, Horse, Tree, five of which are active]

a) Multi-label Classification [Behpour et al. 2018] b) Video Tracking

SLIDE 3

Motivation

a) Multi-label Classification [Behpour et al. 2018] b) Video Tracking

Labeling can be

  • Time-consuming, e.g., document classification
  • Expensive, e.g., medical decision (need doctors)
  • Sometimes dangerous, e.g., landmine detection
SLIDE 4

Previous methods:

➢ CRFs and SSVMs: intractable
➢ SVMs with Platt scaling [Lambrou et al., 2012; Platt, 1999]: unreliable probabilities; interpretation is complicated for multi-class problems

Motivation

Active learning methods, like uncertainty sampling, combined with probabilistic prediction techniques [Lewis & Gale, 1994; Settles, 2012], have been successful.
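As a concrete illustration of uncertainty sampling (a sketch, not the paper's code): query the unlabeled example whose predicted label distribution has the highest entropy. The posteriors below are made-up numbers for illustration only.

```python
import math

def entropy(dist):
    """Shannon entropy (bits) of a discrete distribution."""
    return -sum(p * math.log2(p) for p in dist if p > 0)

def most_uncertain(posteriors):
    """Index of the pool example with the highest predictive entropy."""
    return max(range(len(posteriors)), key=lambda i: entropy(posteriors[i]))

# Hypothetical predicted posteriors for three unlabeled examples.
pool = [
    [0.9, 0.05, 0.05],  # confident
    [0.4, 0.35, 0.25],  # most uncertain -> should be queried
    [0.7, 0.2, 0.1],
]
print(most_uncertain(pool))  # -> 1
```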

SLIDE 5

1- Leveraging adversarial prediction methods [Behpour et al. 2018]:

  • An adversarial approximation of the training data labels, Q̌(ž|y).
  • A predictor, Q̂(ẑ|y), that minimizes the expected loss against the worst-case distribution chosen by the adversary.

Our approach

SLIDE 6

2- Computing mutual information to measure the reduction in uncertainty [Guo and Greiner 2007].

The mutual information of two discrete random variables a and b (the amount of information shared between a and b):

I(a; b) = H(a) + H(b) − H(a, b)

where H(a) and H(b) are the marginal entropies of a and b, and H(a, b) is their joint entropy.

Our approach
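The identity above can be checked directly. A minimal sketch, assuming the joint distribution is given as a 2-D probability table (the example tables are illustrative, not from the paper):

```python
import math

def H(dist):
    """Shannon entropy (bits) of a discrete distribution."""
    return -sum(p * math.log2(p) for p in dist if p > 0)

def mutual_information(joint):
    """I(a; b) = H(a) + H(b) - H(a, b) for a joint probability table."""
    pa = [sum(row) for row in joint]            # marginal of a
    pb = [sum(col) for col in zip(*joint)]      # marginal of b
    pab = [p for row in joint for p in row]     # flattened joint
    return H(pa) + H(pb) - H(pab)

# Perfectly correlated bits: observing b removes all uncertainty about a.
print(mutual_information([[0.5, 0.0], [0.0, 0.5]]))  # -> 1.0
# Independent bits share no information.
print(mutual_information([[0.25, 0.25], [0.25, 0.25]]))  # -> 0.0
```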

SLIDE 7

y = [Sea, Ship, Sheep, Horse, Dog, Person, Mountain, Wolf, Tree]

The predictor chooses among the label vectors (rows)
ŷ ∈ { [0 1 0 1 0 1 1 0 1], [0 1 0 1 0 0 0 1 1], [1 1 1 0 0 1 1 0 1] }
and the adversary chooses among the label vectors (columns)
ž ∈ { [0 0 1 0 1 1 0 1 1], [0 0 0 0 0 1 1 1 1], [0 0 0 1 1 0 1 1 1] }.
Each game matrix entry is L(ŷ, ž) + χ(ž): the loss between the two label vectors plus the adversary's potential term. The adversary's probabilities in this example:
P(ž = [0 0 1 0 1 1 0 1 1]) = 36%, P(ž = [0 0 0 0 0 1 1 1 1]) = 43%, P(ž = [0 0 0 1 1 0 1 1 1]) = 54%.

Game Matrix for Multi- label prediction
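To make the game matrix concrete, here is a sketch under stated assumptions: the loss L is taken to be Hamming loss and the potential χ is set to zero (the slide does not specify either), and mixed strategies are ignored, so the predictor simply picks the row with the best worst-case payoff.

```python
def hamming_loss(y, z):
    """Number of label positions where the two vectors disagree."""
    return sum(a != b for a, b in zip(y, z))

# Predictor (row) and adversary (column) label vectors from the slide.
rows = [
    [0, 1, 0, 1, 0, 1, 1, 0, 1],
    [0, 1, 0, 1, 0, 0, 0, 1, 1],
    [1, 1, 1, 0, 0, 1, 1, 0, 1],
]
cols = [
    [0, 0, 1, 0, 1, 1, 0, 1, 1],
    [0, 0, 0, 0, 0, 1, 1, 1, 1],
    [0, 0, 0, 1, 1, 0, 1, 1, 1],
]

# Game matrix: entry (i, j) = L(y_i, z_j) (+ chi(z_j), omitted here).
game = [[hamming_loss(y, z) for z in cols] for y in rows]

# Pure-strategy minimax: pick the row whose worst-case loss is smallest.
best_row = min(range(len(rows)), key=lambda i: max(game[i]))
print(game, best_row)
```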

SLIDE 8

Sample selection strategy

The total expected reduction in uncertainty (in terms of marginal entropy) over all variables Z_1, ..., Z_n from observing a particular variable Z_k:

W_k = Σ_i I(Z_i; Z_k)
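A sketch of this score under stated assumptions: mutual informations are estimated from empirical samples, and `samples[i][j]` holds the value of variable Z_j in sample i (the data and names are illustrative, not from the paper).

```python
import math
from collections import Counter

def H(counts):
    """Entropy (bits) of an empirical distribution given by counts."""
    total = sum(counts)
    return -sum(c / total * math.log2(c / total) for c in counts if c > 0)

def mutual_information(xs, ys):
    """Empirical I(X; Y) = H(X) + H(Y) - H(X, Y)."""
    hx = H(list(Counter(xs).values()))
    hy = H(list(Counter(ys).values()))
    hxy = H(list(Counter(zip(xs, ys)).values()))
    return hx + hy - hxy

def W(samples, k):
    """Total expected reduction in uncertainty from observing Z_k."""
    cols = list(zip(*samples))
    return sum(mutual_information(cols[i], cols[k]) for i in range(len(cols)))

# Toy samples over three binary variables; Z_0 and Z_1 are copies while
# Z_2 is unrelated, so observing Z_0 (or Z_1) is most informative.
samples = [(0, 0, 0), (0, 0, 1), (1, 1, 0), (1, 1, 1)]
scores = [W(samples, k) for k in range(3)]
print(scores.index(max(scores)))  # -> 0
```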

SLIDE 9

➢ Train a model on the labeled data pool.
➢ Test the model and analyze the unlabeled data pool, computing the potentials φ_j, φ_{j,k}.
➢ Solicit the sample with the highest W_k, e.g., Y = [? 1 ? ? ? ? ? ? ?].
➢ Add/update the sample; return the sample if there is any unannotated label.

Active Learning for Cuts
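The loop above can be sketched end to end. Everything here (the majority-vote "model", the numeric pool, the oracle, the uncertainty score) is a toy stand-in for the paper's cut-based learner, included only to show the control flow.

```python
import random

def train(labeled):
    """Toy 'model': predict the majority label seen so far."""
    if not labeled:
        return lambda x: 0
    majority = round(sum(y for _, y in labeled) / len(labeled))
    return lambda x: majority

def uncertainty(model, x):
    """Toy score that ignores the model: examples near 0.5 count as uncertain."""
    return -abs(x - 0.5)

def active_learning(pool, oracle, budget):
    labeled, unlabeled = [], list(pool)
    for _ in range(budget):
        model = train(labeled)                               # 1. train on labeled pool
        scores = [uncertainty(model, x) for x in unlabeled]  # 2. analyze unlabeled pool
        k = scores.index(max(scores))                        # 3. highest score (W_k)
        x = unlabeled.pop(k)
        labeled.append((x, oracle(x)))                       # 4. solicit annotation
    return labeled

random.seed(0)
pool = [random.random() for _ in range(10)]
queried = active_learning(pool, oracle=lambda x: int(x > 0.5), budget=3)
print(len(queried))  # -> 3
```

Each iteration queries the currently most uncertain example, so the three queried points are exactly the three pool values closest to 0.5, in order of closeness.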

SLIDE 10

a) Bibtex b) Bookmarks c) CAL500 d) Corel5K

Multi-label Experiments

e) Enron f) NUS-WIDE g) TMC2007 h) Yeast

SLIDE 11

a) ETH-BAHNHOF b) TUD-CAMPUS c) TUD-STADTMITTE d) ETH-SUN

Tracking Experiments

e) BAHNHOF-PEDCROSS2 f) CAMPUS-STAD g) SUN-PEDCROSS2 h) BAHNHOF-SUN

SLIDE 12

Leveraging Adversarial Structured Predictions

➢ Adversarial Robust Cut
➢ Adversarial Bipartite Matching

The adversary's probability distribution captures correlations between unknown label variables. This is useful in estimating the value of information for different annotation solicitation decisions, and yields better performance and lower computational complexity.

Conclusion

SLIDE 13

Thank You! Please visit our poster at Pacific Ballroom #264