An Empirical Study on Lazy Multilabel Classification Algorithms - PowerPoint PPT Presentation

An Empirical Study on Lazy Multilabel Classification Algorithms
Eleftherios Spyromitros, Grigorios Tsoumakas and Ioannis Vlahavas
Machine Learning & Knowledge Discovery Group, Department of Informatics, Aristotle University of Thessaloniki, Greece
Outline: Introduction, Lazy Multilabel Algorithms, Experimental Setup, Experimental Results, Conclusions

Introduction: Multilabel Classification, Multilabel Classification Methods
What is Multilabel Classification?
• Single-label Classification
  • Examples are associated with a single label $\lambda$ from a set of disjoint labels $L$
  • If $|L| = 2$, binary classification
  • If $|L| > 2$, multi-class classification
• Multilabel Classification
  • Examples are associated with a set of labels $Y \subseteq L$

Data With Multilabel Nature
• Traditional
  • Text Classification
    • A web article concerning the Antikythera Mechanism Research Project can be categorized into both categories { Science_Technology, History_Culture }
  • Medical Diagnosis
    • Multiple diseases for a patient { Obesity, Hypertension }
• Modern
  • Gene Function Classification
    • A gene usually has multiple functions { Protein Synthesis, Cellular Biogenesis, Cellular Transport }
  • Classification of Music into Emotions
    • A song can make you feel { Sad_Lonely, Quiet_Still }
  • Semantic Scene Analysis
    • { Mountain, Trees, Lake }

Types of Multilabel Classification Methods
• Problem transformation methods
  • They transform the learning problem into one (LP) or more (BR) single-label classification or label ranking problems
  • Algorithm independent
• Algorithm adaptation methods
  • They extend specific algorithms to handle multi-label data
  • SVM, decision tree, neural network, lazy, Bayesian, boosting

The Binary Relevance (BR) Method
• How it works
  • Learns one binary classifier $h_\lambda : X \to \{\lambda, \neg\lambda\}$ for each different label $\lambda \in L$
  • The original dataset $D$ is transformed into $|L|$ datasets $D_\lambda$
  • $D_\lambda$ contains all examples of $D$, labeled as $\lambda$ if they are associated with $\lambda$ and as $\neg\lambda$ otherwise
• Criticism
  • Label correlations are not considered
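The BR transformation described on this slide can be sketched in a few lines. This is a minimal illustration, not the authors' implementation: the function name `binary_relevance_transform`, the toy dataset, and the choice of `True`/`False` to stand for $\lambda$ / $\neg\lambda$ are all assumptions made for the example.

```python
def binary_relevance_transform(dataset, labels):
    """Build one binary dataset per label (hypothetical helper).

    dataset: list of (features, label_set) pairs, i.e. the original D.
    Returns {label: [(features, bool), ...]}, where True stands for
    "labeled as lambda" and False for "labeled as not-lambda".
    """
    return {
        lam: [(x, lam in y) for x, y in dataset]
        for lam in labels
    }


# Toy data (made up for illustration).
D = [
    ((0.1, 0.9), {"Science_Technology", "History_Culture"}),
    ((0.8, 0.2), {"History_Culture"}),
    ((0.5, 0.5), set()),
]
L = ["Science_Technology", "History_Culture"]

transformed = binary_relevance_transform(D, L)
for lam, d_lam in transformed.items():
    print(lam, [target for _, target in d_lam])
```

Each of the $|L|$ resulting datasets keeps every example of $D$, so any binary learner can be trained on each one independently, which is also why label correlations are lost.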

The Label Powerset (LP) Method
• How it works
  • Considers each different subset of $L$ as a single label
  • It learns one single-label classifier $h : X \to P(L)$
• Criticism
  • Large number of label subsets (up to $2^{|L|}$)
  • Most of these are associated with very few examples
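A small sketch of the LP transformation may help make the criticism concrete. The helper name `label_powerset_transform` and the toy scene data are assumptions for illustration only; each distinct label set simply becomes one class, represented here as a `frozenset`.

```python
from collections import Counter


def label_powerset_transform(dataset):
    """Treat every distinct label set as one single-label class (hypothetical helper)."""
    return [(x, frozenset(y)) for x, y in dataset]


# Toy data (made up for illustration), with |L| = 3 labels.
D = [
    ((0.1, 0.9), {"Mountain", "Lake"}),
    ((0.2, 0.8), {"Lake", "Mountain"}),
    ((0.7, 0.3), {"Trees"}),
    ((0.4, 0.6), {"Mountain", "Trees", "Lake"}),
]

lp_data = label_powerset_transform(D)

# How many examples support each powerset class.
class_counts = Counter(cls for _, cls in lp_data)

# Upper bound on the number of classes: 2^|L|.
max_classes = 2 ** 3
print(len(class_counts), "distinct label sets out of at most", max_classes)
```

Even in this tiny example two of the three observed classes are backed by a single example, which mirrors the sparsity problem noted on the slide.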

Lazy Multilabel Algorithms: The BRkNN Algorithm, The Problem of BRkNN, Extensions of BRkNN, MLkNN and LPkNN
The BRkNN Algorithm
• Origin
  • Equivalent to using the BR method in conjunction with the kNN algorithm
• Refinement
  • $|L|$ times faster than BR + kNN in prediction
  • Avoids the redundant calculation of the k nearest neighbors in each one of the transformed datasets $D_\lambda$
  • A single k nearest neighbors search is followed by independent predictions for each label
• Benefit
  • Better suited to domains with a large number of labels and examples that require low response times

How it works
• Confidence scores
  • BRkNN is based on the calculation of a confidence score for each label $\lambda \in L$
  • The confidence $c_\lambda$ is the percentage of the k nearest neighbors that include the label
  • A label is included in the label set when the percentage is higher than or equal to 50%
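The confidence computation and the 50% rule can be sketched as follows. This is a minimal sketch under stated assumptions: Euclidean distance, a brute-force neighbor search, and the function names `brknn_confidences` and `brknn_predict` are choices made for this example, not details given in the slides.

```python
import math


def brknn_confidences(query, train, labels, k):
    """Fraction of the k nearest neighbors that carry each label.

    A single neighbor search is shared by all labels. Euclidean distance
    is an assumption of this sketch.
    """
    neighbors = sorted(train, key=lambda ex: math.dist(query, ex[0]))[:k]
    return {lam: sum(lam in y for _, y in neighbors) / k for lam in labels}


def brknn_predict(query, train, labels, k):
    """Keep every label whose confidence is at least 50%."""
    conf = brknn_confidences(query, train, labels, k)
    return {lam for lam, c in conf.items() if c >= 0.5}


# Toy training set (made up for illustration).
train = [
    ((0.0, 0.0), {"a"}),
    ((0.1, 0.0), {"a", "b"}),
    ((0.0, 0.1), {"b"}),
    ((1.0, 1.0), {"c"}),
    ((0.9, 1.0), {"c"}),
]
labels = ["a", "b", "c"]

conf = brknn_confidences((0.05, 0.05), train, labels, k=3)
predicted = brknn_predict((0.05, 0.05), train, labels, k=3)
print(conf, predicted)
```

Because each label is thresholded on its own, nothing prevents every confidence from falling below 50%, in which case the predicted set is empty.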

Independent Predictions…
• The weakness
  • The empty set is a possible overall output
  • Arises when none of the labels has a confidence of at least 50%
• The reason
  • Independent predictions for each label, a general disadvantage of the BR method
• Is this common in BRkNN?
[Figure: percentage of instances for which the empty set is output (y-axis, 0% to 35%) against the number of nearest neighbors (x-axis, 1 to 29), plotted for the scene, yeast and emotions datasets]

The Proposed Extensions
• Aim: resolve the aforementioned problem
• BRkNN-a
  • Checks if BRkNN outputs the empty set
  • In that case, outputs the label with the highest confidence
• BRkNN-b
  • 1st step: calculates the average size $s$ of the label sets of the k nearest neighbors, $s = \frac{1}{k} \sum_{j=1}^{k} |Y_j|$
  • 2nd step: outputs the $[s]$ (nearest integer of $s$) labels with the highest confidence
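Both extensions operate on the confidences already computed by BRkNN, so they can be sketched as small post-processing steps. The function names, the tie-breaking behavior (first label encountered wins), and rounding halves upward for the nearest integer of $s$ are assumptions of this sketch; the slides do not specify them.

```python
def confidences(neighbor_label_sets, labels):
    """Fraction of the neighbors' label sets that contain each label."""
    k = len(neighbor_label_sets)
    return {lam: sum(lam in y for y in neighbor_label_sets) / k for lam in labels}


def brknn_a(conf):
    """BRkNN-a: fall back to the most confident label if the 50% rule yields nothing."""
    predicted = {lam for lam, c in conf.items() if c >= 0.5}
    if not predicted:
        # Ties are broken by label order here (an assumption).
        predicted = {max(conf, key=conf.get)}
    return predicted


def brknn_b(conf, neighbor_label_sets):
    """BRkNN-b: output the [s] most confident labels, s = average neighbor label-set size."""
    s = sum(len(y) for y in neighbor_label_sets) / len(neighbor_label_sets)
    n = int(s + 0.5)  # nearest integer, halves rounded up (an assumption)
    ranked = sorted(conf, key=conf.get, reverse=True)
    return set(ranked[:n])


# Case 1: no label reaches 50%, so plain BRkNN would output the empty set.
neighbors1 = [{"a"}, {"a", "b"}, {"c"}, {"d"}, {"e"}]
conf1 = confidences(neighbors1, ["a", "b", "c", "d", "e"])
plain1 = {lam for lam, c in conf1.items() if c >= 0.5}

# Case 2: neighbors carry two labels each on average.
neighbors2 = [{"a", "b"}, {"a", "c"}, {"a", "b"}]
conf2 = confidences(neighbors2, ["a", "b", "c"])

print(plain1, brknn_a(conf1), brknn_b(conf1, neighbors1), brknn_b(conf2, neighbors2))
```

In the first case BRkNN-a and BRkNN-b both return a single label instead of the empty set; in the second, BRkNN-b returns two labels because the neighbors' average label-set size is 2.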

The MLkNN and LPkNN Algorithms
• Two more lazy multi-label classification methods
• LPkNN
  • The pairing of the LP problem transformation method with the kNN algorithm
  • Little discussed in the past
• MLkNN
  • An adaptation of kNN for multi-label data
  • Main difference with BRkNN: uses prior and posterior probabilities estimated from the training set
  • Extended with an option for min-max normalization

Experimental Setup: Evaluation Measures, Datasets, Evaluation Methodology
Evaluation Measures
• Example-based
  • Calculate the difference between the actual and predicted label sets for each example
  • Average the results over all examples of the test set
• Label-based
  • Calculate a binary evaluation measure separately for each label
  • Micro/Macro averaging operations over all labels
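The difference between macro and micro averaging of a label-based measure can be shown with a short sketch. F1 is used as the binary measure purely as an assumption (the slide does not fix one), and the helper names and toy predictions are hypothetical.

```python
def counts_per_label(true_sets, pred_sets, labels):
    """Per-label (tp, fp, fn) counts over a test set."""
    counts = {}
    for lam in labels:
        tp = sum(lam in y and lam in z for y, z in zip(true_sets, pred_sets))
        fp = sum(lam not in y and lam in z for y, z in zip(true_sets, pred_sets))
        fn = sum(lam in y and lam not in z for y, z in zip(true_sets, pred_sets))
        counts[lam] = (tp, fp, fn)
    return counts


def f1(tp, fp, fn):
    """Binary F1; defined as 0.0 when there is nothing to count (an assumption)."""
    denom = 2 * tp + fp + fn
    return 2 * tp / denom if denom else 0.0


def macro_f1(counts):
    """Macro: compute the measure per label, then average the values."""
    return sum(f1(*c) for c in counts.values()) / len(counts)


def micro_f1(counts):
    """Micro: sum the counts over all labels, then compute the measure once."""
    tp = sum(c[0] for c in counts.values())
    fp = sum(c[1] for c in counts.values())
    fn = sum(c[2] for c in counts.values())
    return f1(tp, fp, fn)


# Toy test set (made up for illustration).
true_sets = [{"a", "b"}, {"a"}, {"b", "c"}]
pred_sets = [{"a"}, {"a", "b"}, {"b"}]
counts = counts_per_label(true_sets, pred_sets, ["a", "b", "c"])
print(macro_f1(counts), micro_f1(counts))
```

Macro averaging weights every label equally, so the rare label "c" that is never predicted pulls the score down more than it does under micro averaging.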

Example Based Measures
• Notation
  • Let $(x, Y)$ be a multi-label example, $Y \subseteq L$
  • Let $h$ be a multi-label classifier
  • Let $Z = h(x)$ be the set of labels predicted by $h$ for $x$
• Hamming Loss
  • $\frac{|Y \Delta Z|}{|L|}$, where $\Delta$ is the symmetric difference of two sets
• Classification Accuracy or Subset Accuracy
  • 1, if $Y = Z$
  • 0, if $Y \neq Z$
• IR-inspired measures
  • Precision $= \frac{|Y \cap Z|}{|Z|}$, Recall $= \frac{|Y \cap Z|}{|Y|}$, F-measure $= \frac{2\,|Y \cap Z|}{|Z| + |Y|}$
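The example-based measures map directly onto Python set operations. The sketch below evaluates a single example; over a test set the values would be averaged across examples. Returning 0.0 when a denominator is empty is an assumption of this sketch, and the label names are illustrative.

```python
def hamming_loss(Y, Z, n_labels):
    """Size of the symmetric difference of Y and Z, divided by |L|."""
    return len(Y ^ Z) / n_labels


def subset_accuracy(Y, Z):
    """1 if the predicted set matches the true set exactly, else 0."""
    return 1.0 if Y == Z else 0.0


def precision(Y, Z):
    return len(Y & Z) / len(Z) if Z else 0.0  # 0.0 on empty Z is an assumption


def recall(Y, Z):
    return len(Y & Z) / len(Y) if Y else 0.0  # 0.0 on empty Y is an assumption


def f_measure(Y, Z):
    denom = len(Y) + len(Z)
    return 2 * len(Y & Z) / denom if denom else 0.0


# One example with |L| = 4 labels: Mountain, Trees, Lake, Sea.
Y = {"Mountain", "Trees", "Lake"}   # true labels
Z = {"Mountain", "Lake", "Sea"}     # predicted labels

print(hamming_loss(Y, Z, 4), subset_accuracy(Y, Z),
      precision(Y, Z), recall(Y, Z), f_measure(Y, Z))
```

Here two of four labels are misclassified (Trees is missed, Sea is wrongly added), so the Hamming loss is 0.5 while subset accuracy is 0; precision, recall and F-measure all equal 2/3.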
