SLIDE 1 9/23/2009 1
Machine Learning - 10601
Model Selection and Naïve Bayes
Geoff Gordon, Miroslav Dudík
(partly based on slides of Tom Mitchell)
http://www.cs.cmu.edu/~ggordon/10601/ September 23, 2009
Announcements
September 21, 2009: Netflix awards $1 million prize to a team of statisticians, machine-learning experts and computer engineers. “You’re getting Ph.D.’s for a dollar an hour,” Reed Hastings, chief of Netflix, said of the people competing for the prize.
How to win $1 Million
Goal: (user, movie) → rating
Data: 100M (user, movie, date, rating) tuples
Performance measure: root mean squared error
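The performance measure can be sketched in a few lines (a minimal illustration; the function name is my own, not from the slides):

```python
import math

def rmse(predictions, ratings):
    """Root mean squared error between predicted and true ratings."""
    assert len(predictions) == len(ratings)
    squared_error = sum((p - r) ** 2 for p, r in zip(predictions, ratings))
    return math.sqrt(squared_error / len(ratings))
```

Lower is better; a model predicting every rating exactly scores 0.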
How to win $1 Million
A part of the winning model is the “baseline model” capturing the bulk of the information:
[Koren 2009]
How to win $1 Million
Data split: training set | quiz set | test set
FAQ: why quiz/test split?
“We wanted a way of informing you … about your progress … while making it difficult for you to simply train and optimize against ‘the answer oracle.’”
FAQ: why quiz/test split?
Two goals for withholding data
- model selection
- model assessment
What if data is scarce?
Data split: training set | validation set | test set
Cross-validation
- split data randomly into K equal parts
- for each model setting:
evaluate average performance across K train-test splits
- train the best model on the full data set
[diagram: K = 3 splits; each of Part 1, Part 2, Part 3 is held out in turn to evaluate error]
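The procedure above can be sketched as follows (a toy illustration; function names and the seed are my own):

```python
import random

def kfold_indices(n, k, seed=0):
    """Randomly split indices 0..n-1 into k (nearly) equal folds."""
    idx = list(range(n))
    random.Random(seed).shuffle(idx)
    return [idx[i::k] for i in range(k)]

def cross_val_error(train_fn, error_fn, data, k=3):
    """Average error_fn over the k held-out folds; train_fn fits on the rest."""
    folds = kfold_indices(len(data), k)
    errors = []
    for fold in folds:
        held_out = set(fold)
        train = [data[j] for j in range(len(data)) if j not in held_out]
        test = [data[j] for j in fold]
        model = train_fn(train)
        errors.append(error_fn(model, test))
    return sum(errors) / k
```

After picking the model setting with the smallest average error, retrain that model on the full data set.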
The best model…
…depends on the size of the data set:
y ≈ w0 + w1 x + w2 x^2 + w3 x^3 + w4 x^4 + … + w10 x^10
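To see why the best degree depends on the data, here is a small sketch (synthetic data of my own construction) that fits polynomials of increasing degree and scores them on a held-out validation half:

```python
import numpy as np

rng = np.random.default_rng(0)
x = np.linspace(-1.0, 1.0, 40)
# True curve is degree 2, plus a little noise.
y = 1.0 + 2.0 * x - 3.0 * x**2 + 0.1 * rng.standard_normal(40)

x_tr, y_tr = x[::2], y[::2]    # half for training
x_va, y_va = x[1::2], y[1::2]  # half for validation

def val_rmse(degree):
    """Validation RMSE of a degree-`degree` polynomial fit on the training half."""
    w = np.polyfit(x_tr, y_tr, degree)
    return np.sqrt(np.mean((np.polyval(w, x_va) - y_va) ** 2))

best_degree = min(range(1, 7), key=val_rmse)
```

A degree-1 fit cannot capture the quadratic term (underfitting), while high degrees chase the noise (overfitting); validation error picks out something in between.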
K-fold cross-validation trains K models (one per fold), plus a final model on the full data set.
Controlling model complexity
- limit the number of features
- add a “complexity penalty”
Regularized estimation
min_w [ error_train(w) + regularization(w) ]
min_w [ −log p(data|w) − log p(w) ]
Examples of regularization
L2: min_w [ error_train(w) + λ ‖w‖₂² ]
L1: min_w [ error_train(w) + λ ‖w‖₁ ]
[figure: training error, regularization, and their sum (training error + regularization)]
L1 vs L2
L1:
- sparse solutions
- more suitable when #features is much larger than the training set
L2:
- computationally better-behaved
How do you choose λ?
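A common answer is to pick λ by validation error. A minimal ridge-regression sketch (using the standard closed form for the L2-penalized least-squares solution; function names are my own):

```python
import numpy as np

def ridge_fit(X, y, lam):
    """L2-regularized least squares: w = (X'X + lam*I)^-1 X'y."""
    d = X.shape[1]
    return np.linalg.solve(X.T @ X + lam * np.eye(d), X.T @ y)

def choose_lambda(X_tr, y_tr, X_va, y_va, candidates):
    """Return the candidate lambda with the smallest validation RMSE."""
    def val_rmse(lam):
        w = ridge_fit(X_tr, y_tr, lam)
        return np.sqrt(np.mean((X_va @ w - y_va) ** 2))
    return min(candidates, key=val_rmse)
```

Candidates are usually taken on a logarithmic grid (e.g. 10⁻³, 10⁻², …, 10³); cross-validation can replace the single train/validation split when data is scarce.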
Announcements
HW #3 is out; due October 7.
Classification
Goal: learn a map h: x → y
Data: (x1, y1), (x2, y2), …, (xN, yN)
Performance measure:
All you need to know is p(X,Y)…
If you knew p(X,Y), how would you classify an example x? Why?
How many parameters need to be estimated?
Y binary; X described by M binary features X1, X2, …, XM
The full joint p(X,Y) is described by 2^(M+1) − 1 numbers
Naïve Bayes Assumption
- features of X conditionally independent given class Y
Example: Live in Sq Hill?
- S = 1 iff live in Sq Hill
- G = 1 iff shop in Sq Hill Giant Eagle
- D = 1 iff drive to CMU
- A = 1 iff owns a Mac
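The payoff of the assumption is in the parameter count; a quick sketch of the arithmetic (all variables binary, as above):

```python
def full_joint_params(m):
    """Full joint p(X1..Xm, Y): 2^(m+1) outcomes, minus 1 since they sum to 1."""
    return 2 ** (m + 1) - 1

def naive_bayes_params(m):
    """p(Y) needs 1 number; each p(Xj | Y=y) needs 1 number per class: 2m more."""
    return 1 + 2 * m
```

With m = 30 features the full joint needs over two billion numbers, while Naïve Bayes needs only 61.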
Naïve Bayes Assumption
- usually incorrect…
- …but Naïve Bayes often performs well, even when the assumption is violated [see Domingos-Pazzani 1996]
Learning to classify text documents
- which emails are spam?
- which emails promise an attachment?
- which web pages are student home pages?
What are the features of X?
Feature Xj is the jth word
Assumption #1: Naïve Bayes
Assumption #2: “Bag of words” approach
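A minimal bag-of-words Naïve Bayes text classifier, with add-alpha (Laplace) smoothing, might look like this (a toy sketch of the technique, not the course code):

```python
import math
from collections import Counter, defaultdict

def train_nb(docs, labels, alpha=1.0):
    """Multinomial Naive Bayes over bag-of-words counts, add-alpha smoothing."""
    vocab = {w for d in docs for w in d.split()}
    word_counts = defaultdict(Counter)  # per-class word counts
    class_counts = Counter(labels)
    for d, y in zip(docs, labels):
        word_counts[y].update(d.split())
    return vocab, word_counts, class_counts, alpha

def predict_nb(model, doc):
    """Return the class maximizing log p(y) + sum over words of log p(word | y)."""
    vocab, word_counts, class_counts, alpha = model
    n = sum(class_counts.values())
    def log_joint(y):
        total = sum(word_counts[y].values())
        lp = math.log(class_counts[y] / n)
        for w in doc.split():
            if w in vocab:  # ignore words never seen in training
                lp += math.log((word_counts[y][w] + alpha) /
                               (total + alpha * len(vocab)))
        return lp
    return max(class_counts, key=log_joint)
```

Note that word order is discarded entirely; only counts matter, which is exactly the “bag of words” assumption.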
What you should know about Naïve Bayes
Naïve Bayes
Text classification
Gaussian Naïve Bayes
- each feature is Gaussian given the class
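Under that assumption, a Gaussian Naïve Bayes classifier fits a per-class mean and variance for each feature (a plain-Python sketch; the small variance floor is my own safeguard against zero variance):

```python
import math

def fit_gnb(X, y):
    """Per class: prior, plus mean and variance of each feature."""
    params = {}
    for c in sorted(set(y)):
        rows = [x for x, label in zip(X, y) if label == c]
        d = len(rows[0])
        mu = [sum(r[j] for r in rows) / len(rows) for j in range(d)]
        var = [max(sum((r[j] - mu[j]) ** 2 for r in rows) / len(rows), 1e-9)
               for j in range(d)]  # floor avoids division by zero
        params[c] = (len(rows) / len(X), mu, var)
    return params

def predict_gnb(params, x):
    """Return the class maximizing log prior plus Gaussian log-likelihoods."""
    def log_joint(c):
        prior, mu, var = params[c]
        lp = math.log(prior)
        for j, xj in enumerate(x):
            lp += (-0.5 * math.log(2 * math.pi * var[j])
                   - (xj - mu[j]) ** 2 / (2 * var[j]))
        return lp
    return max(params, key=log_joint)
```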