

SLIDE 1

A meta-learning system for multi-instance classification

Gitte Vanwinckelen and Hendrik Blockeel KU Leuven, Belgium

SLIDE 2

Motivation

  • Performed extensive evaluation of multi-instance (MI) learners on datasets from different domains

  • Performance of MI algorithms is very sensitive to the application domain

  • Can we formalize this knowledge by learning a meta-model?

SLIDE 3

Outline

1) Motivation
2) What is multi-instance learning?
3) Design principles of the meta-model
4) Performance evaluation of MI learners
5) Meta-learning results
6) Conclusion

SLIDE 4

MI learning

SLIDE 5

Relationship between instances and bags

  • Traditional MI learning

– At least one positive instance in a bag
– Learn a concept that describes all positive instances (or bags)

  • Generalized MI learning

– All instances in a bag contribute to its label
– Learn a concept that identifies the positive bags

SLIDE 6

Standard multi-instance learning

Drug activity prediction
Identifying musky molecule configurations

[Dietterich, Artificial Intelligence 1997]

SLIDE 7

Generalized multi-instance learning

[J. Amores, Artificial Intelligence '13]

Which bags describe a beach?

SLIDE 8

Meta-learning

  • Which learner performs best on which MI dataset?
  • Construct meta-features from the original learning tasks
  • Learn a model on the meta-dataset (decision tree)

– Number of attributes, size of training sets, correlation with output, ...

  • Landmarkers: fast algorithms [Pfahringer '00] whose performance indicates that of expensive algorithms
SLIDE 9

Meta-learning with landmarking

  • Reduce MI datasets to single-instance datasets based on different MI assumptions

  • Standard MI assumption

– Label instances with the bag label
– One-sided noisy dataset

  • Collective assumption

– All instances contribute equally to the bag label
– Average feature values over all instances in a bag
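The two reductions above can be sketched in a few lines. This is a minimal illustration, not the paper's code; here a bag is assumed to be a list of feature lists:

```python
def standard_reduction(bags, bag_labels):
    """Standard MI assumption: label every instance with its bag's label.
    Positive bags may contain negative instances, so the resulting
    single-instance dataset is one-sidedly noisy."""
    X, y = [], []
    for bag, label in zip(bags, bag_labels):
        X.extend(bag)
        y.extend([label] * len(bag))
    return X, y


def collective_reduction(bags, bag_labels):
    """Collective assumption: all instances contribute equally, so each
    bag is represented by the average of its instances' feature values."""
    X = [[sum(col) / len(bag) for col in zip(*bag)] for bag in bags]
    return X, list(bag_labels)
```

For example, a bag [[1, 2], [3, 4]] with label 1 becomes two instances labeled 1 under the standard reduction, and the single instance [2.0, 3.0] under the collective one.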

SLIDE 10

MI experiments: Datasets

  • SIVAL image classification, CBIR (25)
  • Synthetic newsgroups, text classification (20)
  • Binary classification UCI datasets (27)

– adult, tic-tac-toe, diabetes, transfusion, spam
– i.i.d. sampled to create bags
– Bag configurations: ½, ⅓, ¼, ...

  • Evaluation: Area Under the ROC curve (AUC)
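The i.i.d. bag construction from a UCI dataset might look like the sketch below. All names are invented, and interpreting the ½, ⅓, ¼ configurations as the fraction of positive instances in a positive bag is an assumption, not the paper's documented procedure:

```python
import random


def make_mi_dataset(instances, labels, n_bags=50, bag_size=4,
                    pos_fraction=0.5, seed=0):
    """Sample instances i.i.d. from a single-instance dataset to form bags:
    a positive bag gets round(pos_fraction * bag_size) positive instances
    and negatives for the rest; a negative bag gets only negative
    instances. Returns (bags, bag_labels)."""
    rng = random.Random(seed)
    pos = [x for x, y in zip(instances, labels) if y == 1]
    neg = [x for x, y in zip(instances, labels) if y == 0]
    n_pos = max(1, round(pos_fraction * bag_size))
    bags, bag_labels = [], []
    for i in range(n_bags):
        if i % 2 == 0:  # alternate positive and negative bags
            bag = rng.choices(pos, k=n_pos) + rng.choices(neg, k=bag_size - n_pos)
            bags.append(bag)
            bag_labels.append(1)
        else:
            bags.append(rng.choices(neg, k=bag_size))
            bag_labels.append(0)
    return bags, bag_labels
```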
SLIDE 11

MI experiments: Algorithms

  • Decision trees: SimpleMI-J48, MIWrapper-J48, Adaboost-MITI
  • Rule inducer: MIRI
  • Nearest neighbors: CitationKNN
  • OptimalBall
  • Diverse Density: MDD, EM-DD, MIDD
  • TLD
  • Support Vector Machines: mi-SVM, MISMO (NSK)
  • Logistic regression: MILR, MILR-C
SLIDE 12

Performance overview MI algorithms

  • Comparison of classifiers over multiple datasets [Demsar '06]
  • Are performance differences statistically significant?
  • Friedman test with post-hoc Nemenyi test

– Rank the algorithms on each dataset
– Average the ranks over datasets from the same domain
– Hypothesis test that all algorithms perform equally well
– Nemenyi test identifies statistically equivalent groups of classifiers

  • Critical difference diagram
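The ranking step of this procedure can be reproduced in a few lines. The statistic below follows the Friedman chi-square formula given by Demsar (2006), with average ranks over N datasets and k algorithms; function names are made up for this sketch:

```python
def average_ranks(scores):
    """scores[i][j] = AUC of algorithm j on dataset i (higher is better).
    Returns the average rank of each algorithm; ties share the mean rank."""
    n, k = len(scores), len(scores[0])
    avg = [0.0] * k
    for row in scores:
        order = sorted(range(k), key=lambda j: -row[j])  # best first
        i = 0
        while i < k:
            j = i
            while j + 1 < k and row[order[j + 1]] == row[order[i]]:
                j += 1
            shared = (i + j) / 2 + 1  # mean of ranks i+1 .. j+1
            for t in range(i, j + 1):
                avg[order[t]] += shared / n
            i = j + 1
    return avg


def friedman_statistic(avg_ranks, n_datasets):
    """Friedman chi-square statistic over the average ranks."""
    k = len(avg_ranks)
    return 12 * n_datasets / (k * (k + 1)) * (
        sum(r * r for r in avg_ranks) - k * (k + 1) ** 2 / 4)
```

These average ranks per domain are exactly what a critical difference diagram displays along its axis.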
SLIDE 13

Critical difference diagrams (AUC)

[Critical difference diagrams for the Text, UCI, and CBIR domains]

SLIDE 14

Meta-learning setup

  • 14 learners → binary classification tasks for all combinations of learners (one vs. one)
  • Leave-one-out cross-validation
  • Three dataset domains (CBIR, text, UCI datasets)
  • Landmarkers (standard and collective assumption):

– Naive Bayes
– 1-nearest neighbor
– Logistic regression
– Decision stump
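Under these choices, assembling the meta-dataset might look as follows. This is a sketch with invented names: the landmarker scores serve as meta-features, and each learner pair yields one binary meta-task:

```python
from itertools import combinations


def build_meta_tasks(landmarker_feats, learner_aucs):
    """landmarker_feats[d]: landmarker scores for dataset d (meta-features).
    learner_aucs[d][l]: AUC of learner l on dataset d.
    For each pair (a, b) of learners, build a binary meta-task:
    predict whether learner a beats learner b on a dataset."""
    n_learners = len(learner_aucs[0])
    tasks = {}
    for a, b in combinations(range(n_learners), 2):
        X, y = [], []
        for feats, aucs in zip(landmarker_feats, learner_aucs):
            if aucs[a] == aucs[b]:
                continue  # skip exact ties
            X.append(feats)
            y.append(1 if aucs[a] > aucs[b] else 0)
        tasks[(a, b)] = (X, y)
    return tasks
```

With 14 learners this yields C(14, 2) = 91 binary meta-tasks, each of which can be evaluated with leave-one-out cross-validation against the majority classifier.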

SLIDE 15

UCI Metamodel based on number of features and noise level

[Decision-tree meta-model; leaves marked where the majority classifier wins vs. where the meta-model wins]

SLIDE 16

UCI metamodel: Landmarker approach

[Meta-models for the standard and collective MI landmarkers (decision stump, NB, 1NN, LR); leaves marked where the majority classifier wins vs. where the meta-model wins]

SLIDE 17

CBIR metamodel: Landmarker approach

[Meta-models for the standard and collective MI landmarkers; leaves marked where the majority classifier wins vs. where the meta-model wins]

SLIDE 18

Relationship between landmarkers: logistic regression

[Plots comparing the landmarkers for the CBIR, UCI, and Text domains]

SLIDE 19

Conclusions and future work

  • Demonstrated large differences in MI learner performance across domains

  • It is not sufficient to evaluate on multiple datasets from the same domain

  • A larger meta-dataset is needed
  • Define alternative MI assumptions and translate them to SI datasets

– e.g. Meta-data assumption (NSK)