Classification, Linear Models, Naïve Bayes
CMSC 470 Marine Carpuat
Slides credit: Dan Jurafsky & James Martin, Jacob Eisenstein
Today
Text classification problems and their evaluation
Linear classifiers
Features &
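[Figure: supervised classification. Training: training data labeled with label1, label2, label3, label4 is given to a supervised machine learning algorithm, which produces a classifier. Testing: the classifier takes an unlabeled document and decides among label1, label2, label3, label4.]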
Feature Functions
From: "Fabian Starr" <Patrick_Freeman@pamietaniepeerelu.pl>
Subject: Hey! Sofware for the funny prices!
Get the great discounts on popular software today for PC and Macintosh
http://iiled.org/Cj4Lmx
70-90% Discounts from retail price!!!
All sofware is instantly available to download - No Need Wait!
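A feature function maps a document and a candidate label to a vector of feature values. The sketch below shows one possible bag-of-words version, stored sparsely as a dictionary keyed by (word, label). The tokenizer and the names `tokenize` and `feature_function` are assumptions made for illustration, not taken from the slides.

```python
import re
from collections import Counter


def tokenize(text):
    # Assumed tokenizer: lowercase, keep runs of letters, digits and '%'.
    return re.findall(r"[a-z0-9%]+", text.lower())


def feature_function(text, label):
    # f(x, y): one feature per (word, label) pair, valued by the word count.
    # Features paired with any other label are implicitly zero.
    counts = Counter(tokenize(text))
    return {(word, label): float(count) for word, count in counts.items()}


features = feature_function(
    "Get the great discounts on popular software today", "spam"
)
```

Because every feature is tied to a label, the same word contributes to a different dimension for each candidate label.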
[Figure: a MEDLINE Article is assigned categories from the MeSH Subject Category Hierarchy (visible fragment: "Inhibitors").]
                correct    not correct
selected        tp         fp
not selected    fn         tn

Precision: % of selected items that are correct, tp / (tp + fp)
Recall: % of correct items that are selected, tp / (tp + fn)
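As a small sketch of how the contingency counts are used: precision is tp / (tp + fp), the share of selected items that are correct, and recall is tp / (tp + fn), the share of correct items that are selected. The function names and the toy labels below are hypothetical, and returning 0.0 for an empty denominator is a convention assumed here.

```python
def contingency_counts(gold, predicted, target):
    # Count tp, fp, fn, tn for one target label.
    tp = fp = fn = tn = 0
    for g, p in zip(gold, predicted):
        if p == target and g == target:
            tp += 1
        elif p == target:
            fp += 1
        elif g == target:
            fn += 1
        else:
            tn += 1
    return tp, fp, fn, tn


def precision(tp, fp):
    # Assumed convention: 0.0 when nothing was selected.
    return tp / (tp + fp) if tp + fp else 0.0


def recall(tp, fn):
    # Assumed convention: 0.0 when there are no correct items.
    return tp / (tp + fn) if tp + fn else 0.0


gold = ["spam", "spam", "ham", "ham", "spam"]
predicted = ["spam", "ham", "spam", "ham", "spam"]
tp, fp, fn, tn = contingency_counts(gold, predicted, "spam")
```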
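F measure (weighted harmonic mean of precision P and recall R):
\[ F = \frac{1}{\alpha \frac{1}{P} + (1 - \alpha) \frac{1}{R}} = \frac{(\beta^2 + 1) P R}{\beta^2 P + R} \]
Balanced F1 measure (\beta = 1, that is, \alpha = 1/2):
F = 2PR/(P+R)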
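A minimal sketch of the F measure as a function of precision P and recall R, using the standard form F = (β² + 1)PR / (β²P + R), which reduces to 2PR/(P+R) when β = 1. The function name `f_measure` and the zero-denominator convention are assumptions.

```python
def f_measure(p, r, beta=1.0):
    # Weighted harmonic mean of precision p and recall r.
    # beta > 1 weights recall more heavily; beta < 1 favors precision.
    denominator = beta ** 2 * p + r
    if denominator == 0:
        return 0.0  # assumed convention when both p and r are zero
    return (beta ** 2 + 1) * p * r / denominator


balanced = f_measure(0.5, 1.0)          # equals 2PR / (P + R)
recall_heavy = f_measure(0.5, 1.0, 2.0)
```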
Feature function representation Weights
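To make the two ingredients concrete, here is a sketch of a linear classifier: the score of a (document, label) pair is the dot product of a weight vector with the feature function representation, and the prediction is the highest-scoring label. The whitespace tokenizer, the hand-set weights and all names are hypothetical.

```python
from collections import Counter


def feature_function(text, label):
    # Assumed bag-of-words features keyed by (word, label).
    counts = Counter(text.lower().split())
    return {(word, label): float(count) for word, count in counts.items()}


def score(weights, features):
    # Dot product between sparse weights and sparse features.
    return sum(weights.get(key, 0.0) * value for key, value in features.items())


def predict(weights, text, labels):
    # Choose the label whose feature vector scores highest.
    return max(labels, key=lambda y: score(weights, feature_function(text, y)))


# Hypothetical weights, set by hand purely for illustration.
weights = {("discounts", "spam"): 1.5, ("meeting", "ham"): 2.0}
labels = ["spam", "ham"]
```

In practice the weights are learned from training data rather than set by hand.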
are generated
Score(x,y) Definition of conditional probability Generative story assumptions This is a linear model!
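Bayes rule + Conditional independence assumption:
\[ \hat{y} = \operatorname{argmax}_y P(Y = y \mid X = x) \]
\[ = \operatorname{argmax}_y P(Y = y) \, P(X = x \mid Y = y) \]
\[ = \operatorname{argmax}_y P(Y = y) \prod_{i=1}^{d} P(X_i = x_i \mid Y = y) \]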
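The decision rule can be sketched in code by working in log space, where the product becomes a sum: log P(Y=y) plus one log P(X_i=x_i | Y=y) term per feature. That sum is a linear score. Several choices below are assumptions not stated in the slides: each token is treated as one X_i, probabilities are estimated by counting with add-one smoothing, and words unseen in training are skipped at prediction time.

```python
import math
from collections import Counter, defaultdict


def train_naive_bayes(documents, smoothing=1.0):
    # documents: list of (tokens, label) pairs.
    label_counts = Counter()
    word_counts = defaultdict(Counter)
    vocabulary = set()
    for tokens, label in documents:
        label_counts[label] += 1
        word_counts[label].update(tokens)
        vocabulary.update(tokens)

    total_docs = sum(label_counts.values())
    log_prior = {y: math.log(n / total_docs) for y, n in label_counts.items()}

    log_likelihood = {}
    for y in label_counts:
        # Smoothed estimate of P(word | y); smoothing is an assumption here.
        denominator = sum(word_counts[y].values()) + smoothing * len(vocabulary)
        log_likelihood[y] = {
            w: math.log((word_counts[y][w] + smoothing) / denominator)
            for w in vocabulary
        }
    return log_prior, log_likelihood


def predict(tokens, log_prior, log_likelihood):
    def score(y):
        # log P(Y=y) + sum_i log P(X_i=x_i | Y=y); unseen words are skipped.
        return log_prior[y] + sum(
            log_likelihood[y][w] for w in tokens if w in log_likelihood[y]
        )
    return max(log_prior, key=score)


training = [
    (["discounts", "software", "download"], "spam"),
    (["free", "software", "discounts"], "spam"),
    (["meeting", "notes", "today"], "ham"),
    (["lecture", "notes", "software"], "ham"),
]
log_prior, log_likelihood = train_naive_bayes(training)
```

Read this way, the log prior and log likelihoods play the role of weights, and the word counts in the document play the role of features.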