Introduction to Machine Learning
Classification: Discriminant Analysis
compstat-lmu.github.io/lecture_i2ml
LINEAR DISCRIMINANT ANALYSIS (LDA)
LDA follows a generative approach:

$$\pi_k(x) = P(y = k \mid x) = \frac{P(x \mid y = k)\, P(y = k)}{P(x)} = \frac{p(x \mid y = k)\, \pi_k}{\sum_{j=1}^{g} p(x \mid y = j)\, \pi_j},$$

where we now have to pick a distributional form for $p(x \mid y = k)$.
© Introduction to Machine Learning – 1 / 10
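As a small numerical sketch of this Bayes decomposition (all numbers are made up for illustration):

```python
import numpy as np

# Hypothetical class-conditional density values p(x | y = k) at a single
# observation x, and class priors pi_k (illustrative numbers only).
densities = np.array([0.40, 0.10, 0.05])   # p(x | y = k), k = 1, ..., g
priors    = np.array([0.50, 0.30, 0.20])   # pi_k

# Posterior via Bayes' rule: normalize pi_k * p(x | y = k) over all classes
unnormalized = priors * densities
posterior = unnormalized / unnormalized.sum()
print(posterior)
```

The denominator never needs to be modeled explicitly: normalizing the products $\pi_k \cdot p(x \mid y = k)$ is enough.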
LDA assumes that each class density is modeled as a multivariate Gaussian:

$$p(x \mid y = k) = \frac{1}{(2\pi)^{p/2}\, |\Sigma|^{1/2}} \exp\left(-\frac{1}{2} (x - \mu_k)^T \Sigma^{-1} (x - \mu_k)\right)$$

with equal covariance for all classes, i.e. $\Sigma_k = \Sigma \;\; \forall k$.
[Figure: class-conditional Gaussian densities with equal covariance, plotted over features X1 and X2]
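The density above can be evaluated directly with NumPy; a minimal sketch (the function name `gaussian_density` is ours, not from the lecture):

```python
import numpy as np

def gaussian_density(x, mu, sigma):
    """Multivariate Gaussian density N(x; mu, sigma) as in the formula above."""
    p = len(mu)
    diff = x - mu
    # Normalizing constant: (2*pi)^(p/2) * |sigma|^(1/2)
    norm_const = (2 * np.pi) ** (p / 2) * np.sqrt(np.linalg.det(sigma))
    # solve(sigma, diff) computes Sigma^{-1} (x - mu) without forming the inverse
    return float(np.exp(-0.5 * diff @ np.linalg.solve(sigma, diff)) / norm_const)

# Standard bivariate normal evaluated at (1, 2)
print(gaussian_density(np.array([1.0, 2.0]), np.zeros(2), np.eye(2)))
```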
Parameters $\theta$ are estimated in a straightforward manner:

- $\hat{\pi}_k = n_k / n$, where $n_k$ is the number of class-$k$ observations
- $\hat{\mu}_k = \frac{1}{n_k} \sum_{i:\, y^{(i)} = k} x^{(i)}$
- $\hat{\Sigma} = \frac{1}{n - g} \sum_{k=1}^{g} \sum_{i:\, y^{(i)} = k} \left(x^{(i)} - \hat{\mu}_k\right) \left(x^{(i)} - \hat{\mu}_k\right)^T$
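These estimators translate directly into code; a minimal NumPy sketch (function and variable names are ours, not from the lecture):

```python
import numpy as np

def estimate_lda_params(X, y):
    """Estimate pi_hat_k, mu_hat_k and the pooled covariance Sigma_hat."""
    classes = np.unique(y)
    n, g = X.shape[0], len(classes)
    priors = np.array([np.mean(y == k) for k in classes])         # pi_hat_k = n_k / n
    means = np.array([X[y == k].mean(axis=0) for k in classes])   # mu_hat_k
    # Pooled covariance: within-class scatter summed over classes, divided by n - g
    sigma = sum(
        (X[y == k] - means[i]).T @ (X[y == k] - means[i])
        for i, k in enumerate(classes)
    ) / (n - g)
    return priors, means, sigma

# Two toy Gaussian classes with a shared spherical covariance
rng = np.random.default_rng(0)
X = np.vstack([rng.normal(0, 1, (50, 2)), rng.normal(3, 1, (50, 2))])
y = np.array([0] * 50 + [1] * 50)
priors, means, sigma = estimate_lda_params(X, y)
```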
LDA AS LINEAR CLASSIFIER
Because all class-specific Gaussians share the same covariance structure, the decision boundaries of LDA are linear.
[Figure: linear LDA decision boundaries on the iris data, Petal.Length vs. Petal.Width, for the classes setosa, versicolor, and virginica]
We can formally show that LDA is a linear classifier by showing that the posterior probabilities can be written as linear scoring functions of $x$, up to an isotonic, i.e. rank-preserving, transformation.
$$\pi_k(x) = \frac{\pi_k \cdot p(x \mid y = k)}{p(x)} = \frac{\pi_k \cdot p(x \mid y = k)}{\sum_{j=1}^{g} \pi_j \cdot p(x \mid y = j)}$$

As the denominator is the same for all classes, we only need to consider $\pi_k \cdot p(x \mid y = k)$ and show that this can be written as a linear function of $x$.
$$\begin{aligned}
\pi_k \cdot p(x \mid y = k) &\propto \pi_k \exp\left(-\frac{1}{2} x^T \Sigma^{-1} x - \frac{1}{2} \mu_k^T \Sigma^{-1} \mu_k + x^T \Sigma^{-1} \mu_k\right) \\
&= \exp\left(\log \pi_k - \frac{1}{2} \mu_k^T \Sigma^{-1} \mu_k + x^T \Sigma^{-1} \mu_k\right) \exp\left(-\frac{1}{2} x^T \Sigma^{-1} x\right) \\
&= \exp\left(\theta_{0k} + x^T \theta_k\right) \exp\left(-\frac{1}{2} x^T \Sigma^{-1} x\right) \\
&\propto \exp\left(\theta_{0k} + x^T \theta_k\right)
\end{aligned}$$

by defining $\theta_{0k} := \log \pi_k - \frac{1}{2} \mu_k^T \Sigma^{-1} \mu_k$ and $\theta_k := \Sigma^{-1} \mu_k$.

We have again left out all terms that are the same for all classes $k$: the normalizing constant of our Gaussians and $\exp\left(-\frac{1}{2} x^T \Sigma^{-1} x\right)$.

By finally taking the log, we can write our transformed scores as linear functions: $f_k(x) = \theta_{0k} + x^T \theta_k$.
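Plugging the definitions of $\theta_{0k}$ and $\theta_k$ into code gives the linear scores directly; a sketch assuming priors, class means, and a pooled covariance estimate are already available (names are ours):

```python
import numpy as np

def lda_scores(X, priors, means, sigma):
    """Linear scores f_k(x) = theta_0k + x^T theta_k, one column per class."""
    sigma_inv = np.linalg.inv(sigma)
    thetas = means @ sigma_inv                     # row k holds theta_k = Sigma^{-1} mu_k
    # theta_0k = log pi_k - 1/2 * mu_k^T Sigma^{-1} mu_k
    theta0 = np.log(priors) - 0.5 * np.sum(thetas * means, axis=1)
    return theta0 + X @ thetas.T                   # shape (n, g)

# Toy setup: two classes with means (0, 0) and (3, 3), identity covariance
priors = np.array([0.5, 0.5])
means = np.array([[0.0, 0.0], [3.0, 3.0]])
sigma = np.eye(2)
scores = lda_scores(np.array([[0.0, 0.0], [3.0, 3.0]]), priors, means, sigma)
print(scores.argmax(axis=1))  # predicted class per row
```

Prediction is simply the class with the largest score, since the dropped terms are identical across classes.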
QUADRATIC DISCRIMINANT ANALYSIS (QDA)
QDA is a direct generalization of LDA, where the class densities are now Gaussians with unequal covariances $\Sigma_k$:

$$p(x \mid y = k) = \frac{1}{(2\pi)^{p/2}\, |\Sigma_k|^{1/2}} \exp\left(-\frac{1}{2} (x - \mu_k)^T \Sigma_k^{-1} (x - \mu_k)\right)$$

Parameters are estimated in a straightforward manner:

- $\hat{\pi}_k = n_k / n$, where $n_k$ is the number of class-$k$ observations
- $\hat{\mu}_k = \frac{1}{n_k} \sum_{i:\, y^{(i)} = k} x^{(i)}$
- $\hat{\Sigma}_k = \frac{1}{n_k - 1} \sum_{i:\, y^{(i)} = k} \left(x^{(i)} - \hat{\mu}_k\right) \left(x^{(i)} - \hat{\mu}_k\right)^T$
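The only estimator that changes relative to LDA is the covariance, which is now computed per class; a minimal sketch (the helper name is ours):

```python
import numpy as np

def estimate_qda_covs(X, y):
    """Per-class covariance Sigma_hat_k with the 1/(n_k - 1) correction."""
    covs = {}
    for k in np.unique(y):
        Xk = X[y == k]
        mu_k = Xk.mean(axis=0)
        covs[k] = (Xk - mu_k).T @ (Xk - mu_k) / (len(Xk) - 1)
    return covs

rng = np.random.default_rng(1)
X = rng.normal(size=(60, 2))
y = np.repeat([0, 1], 30)
covs = estimate_qda_covs(X, y)
```

Note that this matches `np.cov`, which uses the same $1/(n_k - 1)$ divisor by default.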
The covariance matrices can differ over classes. This yields a better fit to the data, but also requires the estimation of more parameters.
$$\pi_k(x) \propto \pi_k \cdot p(x \mid y = k) \propto \pi_k\, |\Sigma_k|^{-1/2} \exp\left(-\frac{1}{2} x^T \Sigma_k^{-1} x - \frac{1}{2} \mu_k^T \Sigma_k^{-1} \mu_k + x^T \Sigma_k^{-1} \mu_k\right)$$

Taking the log of the above, we can define a discriminant function that is quadratic in $x$:

$$\log \pi_k - \frac{1}{2} \log |\Sigma_k| - \frac{1}{2} \mu_k^T \Sigma_k^{-1} \mu_k + x^T \Sigma_k^{-1} \mu_k - \frac{1}{2} x^T \Sigma_k^{-1} x$$
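The quadratic discriminant can be written down almost verbatim; a sketch for a single observation and a single class (the function name is ours):

```python
import numpy as np

def qda_score(x, prior, mu, sigma):
    """Quadratic discriminant: log pi_k - 1/2 log|Sigma_k|
    - 1/2 mu_k^T Sigma_k^{-1} mu_k + x^T Sigma_k^{-1} mu_k - 1/2 x^T Sigma_k^{-1} x."""
    sigma_inv = np.linalg.inv(sigma)
    return (np.log(prior)
            - 0.5 * np.log(np.linalg.det(sigma))
            - 0.5 * mu @ sigma_inv @ mu
            + x @ sigma_inv @ mu
            - 0.5 * x @ sigma_inv @ x)

# At x = mu_k with Sigma_k = I, the quadratic and linear terms cancel,
# leaving just log pi_k
print(qda_score(np.array([1.0, 1.0]), 0.5, np.array([1.0, 1.0]), np.eye(2)))
```

As with LDA, prediction picks the class with the largest discriminant value; the extra $\log |\Sigma_k|$ term no longer cancels because the covariances differ per class.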
[Figure: quadratic QDA decision boundaries on the iris data, Petal.Length vs. Petal.Width, for the classes setosa, versicolor, and virginica]