Classification
Machine Learning and Pattern Recognition Chris Williams
School of Informatics, University of Edinburgh
October 2014
(All of the slides in this course have been adapted from previous versions by Charles Sutton, Amos Storkey, David Barber.)
This Lecture
◮ Now we focus on classification. We’ve already seen the naive Bayes classifier. This time:
◮ An alternative classification family, discriminative methods
◮ Logistic regression (this time)
◮ Neural networks (coming soon)
◮ Pros and cons of generative and discriminative methods
◮ Reading: Murphy ch 8 up to 8.3.1, §8.4 (not all sections), §8.6.1; Barber 17.4 up to 17.4.1, 17.4.4, 13.2.3
Discriminative Classification
◮ So far, generative methods for classification. These models take the form p(y, x) = p(x|y)p(y). To classify, use Bayes’s Rule to obtain p(y|x).
◮ Generative assumption: classes exist because the data in each class are drawn from a different distribution.
◮ Next we will use a discriminative approach: model p(y|x) directly. This is a conditional approach; we don’t bother modelling p(x).
◮ Probabilistically, each class label is drawn dependent on the
value of x.
◮ Generative: Class → Data. Model p(x|y).
◮ Discriminative: Data → Class. Model p(y|x).
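The two routes to p(y|x) can be sketched in code. This is a minimal toy example (not from the slides), assuming 1-D inputs, equal-variance Gaussian class-conditionals, and equal priors; with these choices the generative posterior works out to exactly a logistic function of x.

```python
import math

def gaussian_pdf(x, mu, sigma):
    """Density of N(mu, sigma^2) at x."""
    return math.exp(-0.5 * ((x - mu) / sigma) ** 2) / (sigma * math.sqrt(2 * math.pi))

def generative_posterior(x, prior1=0.5, mu0=-1.0, mu1=1.0, sigma=1.0):
    """Generative route: model p(x|y) and p(y), then apply Bayes's Rule
    to get p(y=1|x). The parameter values here are illustrative."""
    p0 = gaussian_pdf(x, mu0, sigma) * (1 - prior1)
    p1 = gaussian_pdf(x, mu1, sigma) * prior1
    return p1 / (p0 + p1)

def discriminative_posterior(x, w=2.0, b=0.0):
    """Discriminative route: model p(y=1|x) directly as a logistic
    function of x; in practice w and b would be fit from data."""
    return 1.0 / (1.0 + math.exp(-(w * x + b)))
```

For this particular generative model the two agree exactly (the log-odds are 2x, so p(y=1|x) = σ(2x)); in general the discriminative model makes no commitment about how x was generated.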
Logistic Regression
◮ Conditional Model
◮ Linear Model
◮ For Classification
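Those three properties can be seen in a minimal sketch of the model: the prediction is a logistic (sigmoid) squashing of a linear function of the input, giving a conditional probability p(y=1|x). The function names and parameter values below are placeholders; in practice w and b are learned, e.g. by maximum likelihood.

```python
import math

def sigmoid(a):
    """Logistic function: sigma(a) = 1 / (1 + exp(-a))."""
    return 1.0 / (1.0 + math.exp(-a))

def predict_proba(x, w, b):
    """Logistic regression: p(y=1|x) = sigma(w . x + b).
    x and w are equal-length lists of floats; b is a scalar bias."""
    a = sum(wi * xi for wi, xi in zip(w, x)) + b
    return sigmoid(a)
```

The decision rule classifies y = 1 when p(y=1|x) > 0.5, i.e. exactly when w . x + b > 0, so the decision boundary is linear in x.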