
Introducing NCC2 Experimental Results Conclusions

Naive Credal Classifier 2: an extension of Naive Bayes for delivering robust classifications

  • G. Corani
  • M. Zaffalon

IDSIA Switzerland

{giorgio,zaffalon}@idsia.ch

DMIN ’08

Naive Credal Classifier 2

  • G. Corani, M. Zaffalon

Introducing NCC2 Experimental Results Conclusions

Outline

1

Introducing NCC2 Background Credal classifiers NCC2

2

Experimental Results Setup and indicators Indeterminate classifications vs posterior probabilities

3

Conclusions


Naive Bayes Classifier (NBC)

Naive assumption (statistical independence of the features given the class):

$$\theta_{c|f_1,f_2,\ldots,f_k} \propto \theta_c \prod_{i=1}^{k} \theta_{f_i|c}$$

Probability computation

$\theta^{\mathrm{POST}} \propto \theta^{\mathrm{LIKELIHOOD}}\,\theta^{\mathrm{PRIOR}}$. Maximum-likelihood estimators are, for instance, $\hat{\theta}_c = n(c)/N$ and $\hat{\theta}_{f_i|c} = n(f_i|c)/n(c)$. The choice of any specific prior necessarily introduces some subjectivity.
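As an illustration, the maximum-likelihood estimators and the effect of a prior can be sketched in a few lines of Python. The function name, the toy data, and the Laplace-style symmetric prior of strength s are illustrative choices, not part of the original software:

```python
def nbc_posteriors(train, instance, classes, s=0.0):
    """Unnormalized NBC scores theta_c * prod_i theta_{f_i|c}.

    train: list of (features_tuple, class_label) pairs.
    s = 0 gives the maximum-likelihood estimators from the slide,
    theta_c = n(c)/N and theta_{f_i|c} = n(f_i|c)/n(c);
    s > 0 adds a symmetric Dirichlet (Laplace-style) prior."""
    N = len(train)
    # distinct values seen for each feature (needed for the prior term)
    n_vals = [len({f[i] for f, _ in train}) for i in range(len(instance))]
    scores = {}
    for c in classes:
        rows = [f for f, y in train if y == c]
        n_c = len(rows)
        score = (n_c + s) / (N + s * len(classes))          # theta_c
        for i, v in enumerate(instance):
            n_fic = sum(1 for f in rows if f[i] == v)
            score *= (n_fic + s) / (n_c + s * n_vals[i])    # theta_{f_i|c}
        scores[c] = score
    return scores
```

With s = 0, a feature value never observed in a class zeroes out that class; any s > 0 avoids this, but picks one specific prior, which is exactly the subjectivity the slide points out.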


NBC and prior sensitivity

NBC computes a single posterior distribution. However, the most probable class might depend on the chosen prior, especially on small data sets. Prior-dependent classifications might be fragile. Solution via set of probabilities:

Robust Bayes Classifier (Ramoni and Sebastiani, 2001) Naive Credal Classifier (Zaffalon, 2001)


Naive Credal Classifier (NCC) (Zaffalon, 2001)

Extends Naive Bayes to imprecise probabilities: it specifies a set of priors by adopting the Imprecise Dirichlet Model (IDM). The set of priors is turned into a set of posteriors via Bayes' rule, and NCC returns the classes that are non-dominated within the set of posteriors.
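Under the IDM, the posterior expectation of a chance, given n observations of a value out of N, is (n + s·t)/(N + s), with the prior mean t ranging over (0, 1) and s the IDM hyperparameter (s = 2 is a common choice). A minimal sketch of the resulting interval, with an illustrative function name:

```python
def idm_interval(n, N, s=2.0):
    """Lower/upper posterior expectation of a chance under the
    Imprecise Dirichlet Model: E[theta] = (n + s*t)/(N + s),
    with the prior mean t spanning the open interval (0, 1)."""
    return n / (N + s), (n + s) / (N + s)
```

The interval shrinks as N grows, so the imprecision induced by the set of priors matters mostly on small data sets.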


Test of dominance and indeterminate classifications

Definition

Class c1 dominates c2 if P(c1) > P(c2) under every distribution of the posterior credal set. If no class dominates c1, then c1 is non-dominated. If there is more than one non-dominated class, NCC returns all of them: the classification is indeterminate. NCC becomes indeterminate precisely on the instances whose classification would be prior-dependent with NBC. Indeterminate classifications proved viable in real-world case studies (e.g., dementia diagnosis).
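To illustrate the idea of non-dominated classes, here is a sketch based on interval dominance, a simpler criterion than the credal-dominance test NCC actually uses (interval dominance can be slightly more cautious). Each class's posterior probability is summarized by a (lower, upper) interval:

```python
def non_dominated(intervals):
    """Interval-dominance sketch: intervals maps class -> (lower, upper)
    posterior probability.  c1 dominates c2 when P(c1) > P(c2) under every
    distribution; with intervals this reduces to lower(c1) > upper(c2).
    A class is non-dominated if no other class dominates it."""
    classes = list(intervals)
    return [c2 for c2 in classes
            if not any(intervals[c1][0] > intervals[c2][1]
                       for c1 in classes if c1 != c2)]
```

When the intervals of the top classes overlap, several classes survive and the classification is indeterminate.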


Incomplete data sets

Most classifiers (including NBC) ignore missing data. This is correct only if data are missing-at-random (MAR). The MAR hypothesis cannot be tested on the incomplete data themselves; yet ignoring Non-MAR missing data can lead to unreliable conclusions. Moreover, missing data can be MAR for some features but not for others, or MAR only in training and not in testing (or vice versa).


NCC2: NCC with conservative treatment of missing data (I)

NCC2 receives a declaration of which features have Non-MAR missing data and at which stage (learning, testing, or both). It ignores MAR missing data and treats Non-MAR missing data conservatively.

Conservative treatment of missing data (learning set)

Every completion of the missing data is regarded as possible, so a set of likelihoods is computed; combining the set of priors with the set of likelihoods yields a set of posteriors. This conservative treatment of missing data can generate additional indeterminacy.
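The effect on the learning set can be illustrated with counts: when some entries of a feature are Non-MAR missing within a class, every completion is possible, so the count n(f_i|c) is only known to lie in an interval. A toy sketch (function name and data are illustrative):

```python
def count_bounds(observed, value, n_missing):
    """Bounds on n(f_i = value | c) when n_missing entries of the
    feature are Non-MAR missing within class c: the count ranges from
    'no missing entry equals value' to 'every missing entry does'."""
    n = sum(1 for v in observed if v == value)
    return n, n + n_missing
```

Plugging interval counts instead of exact counts into the estimators widens the posterior credal set, which is where the additional indeterminacy comes from.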


NCC2: NCC with conservative treatment of missing data (II)

Conservative treatment of missing data in the instance to be classified

All completions of the missing data are regarded as possible, giving rise to several virtual instances. Test of dominance: c1 must dominate c2 on all the virtual instances. A procedure allows the dominance relationships to be identified without actually building the virtual instances. The conservative treatment of missing data in the instance to classify can generate additional indeterminacy.
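The definition can be illustrated by brute force: mark the Non-MAR missing entries of the test instance, enumerate the virtual instances, and require dominance on each. The paper's procedure avoids this explicit enumeration; the sketch below, with hypothetical names, only illustrates the definition:

```python
from itertools import product

def virtual_instances(instance, domains):
    """Enumerate all completions of a test instance whose Non-MAR missing
    entries are marked None; domains[i] lists feature i's possible values."""
    options = [domains[i] if v is None else [v]
               for i, v in enumerate(instance)]
    return list(product(*options))

def dominates_on_all(dom_test, c1, c2, instance, domains):
    """c1 dominates c2 only if it does so on every virtual instance;
    dom_test(c1, c2, vi) is any single-instance dominance test."""
    return all(dom_test(c1, c2, vi)
               for vi in virtual_instances(instance, domains))
```

Requiring dominance on every completion makes it harder for any class to dominate, which is why this treatment can only add indeterminacy.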


What to expect from NCC2

By adopting imprecise probabilities, NCC2 is designed to be robust to:

  • prior specification, especially critical on small data sets;
  • Non-MAR missing data, critical on incomplete data sets.

However, excessive indeterminacy is undesirable.

What to expect from indeterminate classifications

They should preserve NCC2's reliability, avoiding too-strong conclusions (a single class) on doubtful instances, and convey sensible information, dropping unlikely classes.


Indicators of performance for NCC2

NCC2

  • determinacy: % of determinate classifications;
  • single-accuracy: % of determinate classifications that are accurate;
  • set-accuracy: % of indeterminate classifications that contain the true class;
  • size of indeterminate output: average number of classes returned when indeterminate.
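Given the sets of classes returned by a credal classifier and the true classes, these indicators are straightforward to compute; a sketch with illustrative names:

```python
def credal_indicators(predictions, truths):
    """Indicators from the slide.  predictions: list of sets of classes
    returned by the credal classifier; truths: the true class labels."""
    det = [(p, t) for p, t in zip(predictions, truths) if len(p) == 1]
    ind = [(p, t) for p, t in zip(predictions, truths) if len(p) > 1]
    return {
        "determinacy": len(det) / len(predictions),
        "single_accuracy": (sum(t in p for p, t in det) / len(det)
                            if det else None),
        "set_accuracy": (sum(t in p for p, t in ind) / len(ind)
                         if ind else None),
        "indeterminate_size": (sum(len(p) for p, _ in ind) / len(ind)
                               if ind else None),
    }
```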


Indicators for comparing NBC and NCC2

NCC2 vs NBC

NBC(NCC2 D): accuracy of NBC on instances determinately classified by NCC2. NBC(NCC2 I): accuracy of NBC on instances indeterminately classified by NCC2. We expect NBC(NCC2 D) > NBC(NCC2 I).


Experimental setting

18 UCI complete data sets (numerical features are discretized).

MAR setting

Each observation (apart from the class) is turned into missing with 5% probability. All features are declared MAR to NCC2.

Non-MAR setting

Split the categories of each feature into two halves; only observations whose value falls in the first half are turned into missing, with probability 5%. All features are declared Non-MAR to NCC2.
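The two missingness mechanisms can be sketched as follows (function name and the fixed random seed are illustrative). In the Non-MAR variant, only values in the first half of a feature's sorted categories can go missing, so missingness depends on the value itself:

```python
import random

def make_missing(values, mar=True, p=0.05, rng=None):
    """Turn observations of one feature into missing (None) with
    probability p.  MAR: every value can go missing.  Non-MAR (the
    slide's mechanism): only values in the first half of the sorted
    categories can go missing."""
    rng = rng or random.Random(0)
    cats = sorted(set(values))
    first_half = set(cats[:max(1, len(cats) // 2)])
    return [None if (mar or v in first_half) and rng.random() < p else v
            for v in values]
```

Under the Non-MAR mechanism, the observed values are a biased sample of the true ones, which is exactly the situation where ignoring missing data misleads NBC.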


Results on 18 UCI data sets (10 runs of 10 folds c-v)

MAR setup: 5% missing data generated via a MAR mechanism; all features declared as MAR to NCC2. Non-MAR setup: 5% missing data generated via a Non-MAR mechanism; all features declared as Non-MAR to NCC2. Average NBC accuracy under both settings: 82%.

NBC vs NCC2

  • NBC(NCC2 D): 85% (95%)
  • NBC(NCC2 I): 36% (69%)

(Figures refer to the MAR setup, with the Non-MAR setup in parentheses.) On each data set and setup: NBC(NCC2 D) > NBC(NCC2 I).

NCC2

  • determinacy: 95% (52%)
  • single-accuracy: 85% (95%)
  • set-accuracy: 85% (96%)
  • size of indeterminate output: ≈ 33% of the classes

Indeterminate classifications do preserve the reliability of NCC2!


NBC Probabilities vs indeterminate classifications (MAR setup, average over all data sets)

[Figure: NCC2 determinacy and NBC accuracy, split into NBC(NCC2 D) and NBC(NCC2 I), plotted against the probability (60%–100%) estimated by NBC for the class it returns; average over 18 data sets.]

Higher posterior probability of NBC → higher NCC2 determinacy. At any level of posterior probability, NBC(NCC2 D) > NBC(NCC2 I); the drop in NBC accuracy is striking even on the instances classified confidently by NBC.


NBC Probabilities vs indeterminate classifications (Non-MAR setup)

Joint analysis over all the data sets:

[Figure: NCC2 determinacy and NBC accuracy, split into NBC(NCC2 D) and NBC(NCC2 I), plotted against the probability (60%–100%) estimated by NBC for the class it returns; average over 18 data sets.]

Non-MAR missing data lead to indeterminate classifications even if the probability computed by NBC is high. At any level of posterior probability, NBC(NCC2 D)>NBC(NCC2 I).


Summary

NCC2 extends Naive Bayes to imprecise probabilities, to robustly deal with small data sets and missing data. NCC2 becomes indeterminate on instances whose classification is indeed doubtful. Indeterminate classifications preserve the classifier's reliability while conveying sensible information. Bibliography, software and manuals: see

www.idsia.ch/~giorgio/jncc2.html

Software with a GUI coming soon!

Naive Credal Classifier 2

  • G. Corani, M. Zaffalon