  1. The Multilabel Naive Credal Classifier Alessandro Antonucci and Giorgio Corani { alessandro,giorgio } @idsia.ch Istituto “Dalle Molle” di Studi sull’Intelligenza Artificiale - Lugano (Switzerland) http://ipg.idsia.ch ISIPTA ’15, Pescara, July 21st, 2015

  2. IPG ⊂ IDSIA ⊂ USI ∪ SUPSI ⊂ LUGANO
     University of Applied Sciences and Arts of Southern Switzerland (supsi.ch); Università della Svizzera Italiana (usi.ch)

  7. Chronology (Acknowledgements)
     ISIPTA ’01: credal version of the naive Bayes classifier, by Marco (Zaffalon)
     ISIPTA ’11: MAP algorithms for imprecise HMMs, by Jasper (De Bock) & Gert (de Cooman)
     IJCAI-13: Bayes nets as multilabel classifiers, by Denis (Mauá) & us
     NIPS ’14: MAP in generic credal nets, by Jasper & Cassio (de Campos) & me
     ISIPTA ’15: a credal classifier based on MAP tasks in credal nets, by us

  12. Single- vs. multi-label classification
     A (fictitious) classifier to detect eye color.
     SINGLE-LABEL: possible classes C := { brown, green, blue }, e.g. C = green.
     Heterochromia iridum: two (or more) colors. Possible values in 2^C, a multilabel task, e.g. C = { blue, brown }.
     Trivial approaches:
     - standard classification over the power set: exponential in the number of labels!
     - each label as a separate Boolean variable, with a (standard) classifier for each label: relations among classes are ignored!
     MULTI-LABEL: graphical models (GMs) to depict the relations among class labels (and features); classification as (standard) inference in GMs.
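The two trivial reductions above can be made concrete in a few lines of Python; this is only an illustrative sketch, with the label names taken from the eye-color example and everything else invented:

```python
# Minimal sketch of the two trivial multilabel reductions from the slide
# (label names follow the eye-color example; the rest is illustrative).
from itertools import chain, combinations

labels = ["brown", "green", "blue"]

# Power-set approach: one ordinary class per subset of labels.
# The class space has 2^n elements: exponential in the number of labels.
power_set = list(chain.from_iterable(
    combinations(labels, r) for r in range(len(labels) + 1)))
print(len(power_set))  # 2^3 = 8 classes for 3 labels

# Binary-relevance approach: one Boolean variable per label, each predicted
# by a separate standard classifier; relations among labels are ignored.
per_label_prediction = {"brown": True, "green": False, "blue": True}
predicted_labels = {l for l, on in per_label_prediction.items() if on}
print(sorted(predicted_labels))  # the heterochromia case: ['blue', 'brown']
```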

  13. Credal classifiers are not (yet) multilabel classifiers
     Class variable C, (discrete) features F, and a test instance ˜f.
     Standard (single-label) classifiers are maps F → C: learn P(C, F) from data and return c* := argmax_{c ∈ C} P(c, ˜f).
     Multi-label classifiers are maps F → 2^C: with C = (C_1, ..., C_n) an array of Boolean variables, one for each label, learn P(C, F) and solve the MAP task c* := argmax_{c ∈ {0,1}^n} P(c, ˜f).
     Credal (single-label) classifiers are maps F → 2^C: learn a credal set K(C, F) and return all c'' ∈ C s.t. ∄ c' : P(c', ˜f) > P(c'', ˜f) ∀ P(C, F) ∈ K(C, F).
     A multilabel credal classifier (MCC) is a map F → 2^(2^C): learn a credal set K(C, F) and return all sequences c'' s.t. ∄ c' : P(c', ˜f) > P(c'', ˜f) ∀ P(C, F) ∈ K(C, F).
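As a toy illustration of the credal (single-label) decision rule, the sketch below uses interval dominance, a conservative sufficient condition for the dominance test above (if lower(c') > upper(c''), then P(c', ˜f) > P(c'', ˜f) for every P in the credal set). The interval representation and all numbers are hypothetical, not the paper's K(C, F):

```python
# Sketch of a credal single-label decision rule, assuming the credal set is
# summarized by intervals [lower, upper] on P(c, f) for each class
# (interval dominance: a conservative version of the maximality criterion).
def non_dominated(intervals):
    """Keep every class c'' for which no competitor c' is guaranteed to
    satisfy P(c', f) > P(c'', f) for ALL P in the credal set.
    With intervals, c' surely dominates c'' when lower(c') > upper(c'')."""
    survivors = []
    for c2, (_, up2) in intervals.items():
        dominated = any(lo1 > up2
                        for c1, (lo1, _) in intervals.items() if c1 != c2)
        if not dominated:
            survivors.append(c2)
    return survivors

# Hypothetical interval scores for the eye-color example:
scores = {"brown": (0.20, 0.35), "green": (0.05, 0.15), "blue": (0.30, 0.50)}
print(non_dominated(scores))  # ['brown', 'blue']: 'green' is ruled out
```

Note that the output is a *set* of classes: when intervals overlap, the credal classifier stays indeterminate rather than picking a single winner.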

  14. Compact Representation of the Output
     The output of an MCC might be exponentially large. Jasper & Gert's idea to fix this with imprecise HMMs (Viterbi): decide, for each variable and each state, whether or not there is at least one optimal sequence in which that variable is in that state.
     With MCCs, for each class label, we can decide whether:
     - the label is active in all the optimal sequences;
     - the label is inactive in all the optimal sequences;
     - there are optimal sequences with the label active, and others with the label inactive.
     Optimization task (for each label l and state c_l = 0/1):
     min_{c' : c'_l = c_l}  max_{c'' : c'' ≠ c'}  inf_{P(C,F) ∈ K(C,F)}  P(c'', ˜f) / P(c', ˜f)  ≤ 1
     O(2^treewidth) for separately specified credal nets (e.g., local IDM); more complex with non-separate specifications.

  15. [Diagram: the naive Bayes classifier (NBC): class node C with feature children F_1, F_2, ..., F_m]

  16. [Diagram: NCC = NBC + IDM: the same topology, with an imprecise (IDM) quantification]

  17. Multi-label? Naive topology over the classes C_1, C_2, ..., C_n, with features F_1, F_2, ..., F_m. Structural learning to bound the number of parents of the features and to select the super-class C_1.

  18. [Diagram: MNBC, features replicated to obtain a tree topology over the classes C_1, ..., C_n and the replicated features]

  19. [Diagram: MNCC = MNBC + IDM, the same replicated-features tree with an imprecise (IDM) quantification]

  20. During the poster session I can:
     - explain some details about the learning of the structure;
     - explain the feature-replication trick (this makes inference simpler);
     - explain the non-separate IDM-based quantification of the model;
     - explain the details of the (convex) optimization;
     - ...

  21. MNCC: the algorithm
     Input: test instance ˜f (+ dataset D). Output: a table with rows "active"/"inactive" and one column per class C_1, ..., C_n, initialized to all zeros.
     for l = 1, ..., n do
       for c_l = 0, 1 do
         if min_{c' : c'_l = c_l} max_{c'' : c'' ≠ c'} inf_t P_t(c'', ˜f) / P_t(c', ˜f) ≤ 1 then
           Output(l, c_l) := 1
         end if
       end for
     end for
     The table is a linear-size representation of a (possibly exponential) number of maximal sequences.
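Assuming the credal set is finitely generated (a list of extreme-point distributions indexed by t), the per-label test in the loop above can be sketched by brute force; all names, the record format, and the numbers are invented for illustration:

```python
# Illustrative sketch of the MNCC decision rule: for each label l and state,
#   Output(l, c_l) = 1  iff
#   min_{c': c'_l = c_l} max_{c'' != c'} inf_t P_t(c'', f) / P_t(c', f) <= 1,
# i.e., some sequence with label l in that state is maximal (undominated).
# The credal set is given as a list of dicts mapping each Boolean label
# sequence to its joint score P_t(c, f); everything here is hypothetical.
from itertools import product

def mncc_output_table(extreme_points, n_labels):
    seqs = list(product((0, 1), repeat=n_labels))

    def domination_degree(c1):
        # > 1 means some competitor c2 beats c1 under EVERY distribution
        return max(min(P[c2] / P[c1] for P in extreme_points)
                   for c2 in seqs if c2 != c1)

    table = {}
    for l in range(n_labels):
        for state in (0, 1):
            candidates = [c for c in seqs if c[l] == state]
            table[(l, state)] = int(
                min(domination_degree(c) for c in candidates) <= 1)
    return table

# Two extreme points over two labels (numbers invented):
P1 = {(0, 0): 0.1, (0, 1): 0.4, (1, 0): 0.3, (1, 1): 0.2}
P2 = {(0, 0): 0.2, (0, 1): 0.3, (1, 0): 0.3, (1, 1): 0.2}
print(mncc_output_table([P1, P2], 2))  # every entry 1: both labels indeterminate
```

This brute force enumerates all 2^n sequences, which is exactly the blow-up the paper's compact representation avoids; it is only meant to make the decision rule concrete.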

  22. Testing MNCC
     Preliminary tests on real-world datasets:
     Data set     Classes  Features   Instances
     Emotions     6        44/72      593
     Scene        6        224/294    2407
     E-mobility   10       14/18      4226
     Slashdot     22       496/1079   3782
     Performance described by:
     - % of instances s.t. the label takes the same state in all maximal sequences (determinacy);
     - accuracy of the precise model when MNCC is determinate;
     - accuracy of the precise model when MNCC is indeterminate.
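The three performance figures can be computed per class label from the test outputs; a hypothetical sketch (the record format and all names are invented, not from the paper):

```python
# Sketch of the three evaluation quantities for one class label, assuming
# each test record carries: the set of states MNCC leaves open for the
# label, the precise model's prediction, and the ground truth.
def summarize(records):
    determinate = [r for r in records if len(r["credal_states"]) == 1]
    indeterminate = [r for r in records if len(r["credal_states"]) > 1]
    determinacy = len(determinate) / len(records)
    acc = lambda rs: (sum(r["precise"] == r["truth"] for r in rs) / len(rs)
                      if rs else None)
    # accuracy of the precise model, split by MNCC's (in)determinacy
    return determinacy, acc(determinate), acc(indeterminate)

records = [
    {"credal_states": {1}, "precise": 1, "truth": 1},
    {"credal_states": {0}, "precise": 0, "truth": 1},
    {"credal_states": {0, 1}, "precise": 1, "truth": 1},
    {"credal_states": {0, 1}, "precise": 0, "truth": 1},
]
print(summarize(records))  # (0.5, 0.5, 0.5)
```

The interesting pattern, as in the plots that follow, is a precise model that is clearly less accurate on the instances where MNCC suspends judgment.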

  23. [Bar chart: the three performance measures per class on Emotions (classes C_1, ..., C_6); y-axis from 0 to 1]

  24. [Bar chart: the same measures on Scene (classes C_1, ..., C_6); y-axis from 0 to 1]

  25. [Bar chart: the same measures on E-mobility (classes C_1, ..., C_10); y-axis from 0 to 1]

  26. [Bar chart: the same measures on Slashdot (22 classes); y-axis from 0 to 1]

  27. Conclusions, Outlooks and Acks
     Among the first tools for robust multilabel classification. Still lots of things to do:
     - extension to the multidimensional/hierarchical case;
     - extension to continuous variables (features);
     - extension to a continuous class (multi-target interval-valued regression);
     - more complex topologies (ETAN, de Campos, 2014);
     - a variational approach to feature replication;
     - not only 0/1 losses (imprecise losses?).
