Probabilistic classification
CE-717: Machine Learning
Sharif University of Technology
- M. Soleymani
Fall 2016
Topics
- Probabilistic approach
- Bayes decision theory
- Generative models
- Gaussian Bayes classifier
- Naïve Bayes
- Logistic regression
$\alpha(\boldsymbol{x}) = \arg\max_{y}$
Choose the class with the highest posterior $p(\mathcal{C}_k \mid \boldsymbol{x})$
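The rule above can be sketched in a few lines of Python. This is a minimal illustration, not the course's code: the function names `posteriors` and `decide` and all numbers are hypothetical, and the class-conditional likelihoods are assumed to be already evaluated at one input.

```python
def posteriors(priors, likelihoods):
    """Bayes' rule: p(C_k | x) = p(x | C_k) p(C_k) / p(x)."""
    joint = [p * l for p, l in zip(priors, likelihoods)]
    evidence = sum(joint)  # p(x), the normalizer
    return [j / evidence for j in joint]


def decide(priors, likelihoods):
    """Return the index k of the class with the highest posterior."""
    post = posteriors(priors, likelihoods)
    return max(range(len(post)), key=post.__getitem__)


# Hypothetical numbers for a 3-class problem at a single input x.
priors = [0.5, 0.3, 0.2]       # p(C_k)
likelihoods = [0.1, 0.4, 0.3]  # p(x | C_k) evaluated at x
post = posteriors(priors, likelihoods)
k_star = decide(priors, likelihoods)
print(post, k_star)
```

Because the evidence is the same for every class, comparing the unnormalized products of prior and likelihood would select the same class.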
This example has been adapted from Sanja Fidler’s slides, University of Toronto, CSC411
Class-conditional densities $p(x \mid y = 0)$ and $p(x \mid y = 1)$
$\sigma_1^2 = \frac{1}{N_1} \sum_{n:\, y^{(n)} = 1} \left(x^{(n)} - \mu_1\right)^2$
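As a rough sketch of how the per-class maximum-likelihood estimates could be computed for a 1-D Gaussian model, the Python below fits a mean and a variance to the samples of each class. The data values and the helper names `fit_gaussian` and `gaussian_pdf` are made up for illustration.

```python
import math


def fit_gaussian(xs):
    """Maximum-likelihood mean and variance of a 1-D sample."""
    n = len(xs)
    mu = sum(xs) / n
    var = sum((x - mu) ** 2 for x in xs) / n  # divides by N_k, not N_k - 1
    return mu, var


def gaussian_pdf(x, mu, var):
    """Density of N(mu, var) evaluated at x."""
    return math.exp(-((x - mu) ** 2) / (2 * var)) / math.sqrt(2 * math.pi * var)


# Hypothetical 1-D training data with binary labels y in {0, 1}.
x_train = [1.0, 2.0, 3.0, 6.0, 7.0, 8.0, 9.0]
y_train = [0, 0, 0, 1, 1, 1, 1]

params = {}
for label in (0, 1):
    xs = [x for x, y in zip(x_train, y_train) if y == label]
    params[label] = fit_gaussian(xs)

mu1, var1 = params[1]
print(params)
```

The fitted densities `gaussian_pdf(x, *params[0])` and `gaussian_pdf(x, *params[1])` can then be combined with class priors to classify a new point.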
$p(\boldsymbol{x})$
Decision regions $\mathcal{R}_i$, $i = 1, \dots, K$
If action $\alpha(\boldsymbol{x}) = i$ is taken and the true category is $\mathcal{C}_j$, then the decision is correct if $i = j$ and in error if $i \neq j$.
Zero-one loss function:
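$\boldsymbol{x}$ is assigned to class $\mathcal{C}_i$ if doing so gives the smallest expected loss under the posteriors $p(\mathcal{C}_j \mid \boldsymbol{x})$, $j = 1, \dots, K$.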
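A small sketch of the minimum-risk rule may help here. It assumes a loss table `loss[i][j]` giving the cost of taking action i when the true class is j. That name, the posterior values and the asymmetric loss are all hypothetical.

```python
def conditional_risks(loss, posterior):
    """R(i | x) = sum_j loss[i][j] * p(C_j | x), one value per action i."""
    return [sum(l_ij * p_j for l_ij, p_j in zip(row, posterior)) for row in loss]


def min_risk_action(loss, posterior):
    """Index of the action with the smallest conditional risk."""
    risks = conditional_risks(loss, posterior)
    return min(range(len(risks)), key=risks.__getitem__)


posterior = [0.7, 0.3]          # hypothetical p(C_j | x)
zero_one = [[0, 1], [1, 0]]     # zero-one loss: every error costs 1
asymmetric = [[0, 10], [1, 0]]  # hypothetical: missing class 1 costs 10x more

a_zero_one = min_risk_action(zero_one, posterior)
a_asym = min_risk_action(asymmetric, posterior)
print(a_zero_one, a_asym)
```

Under the zero-one loss the risk of action i is 1 minus the posterior of class i, so the rule reduces to picking the most probable class. The asymmetric table shows how a costly error can overturn that choice.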
High number of parameters
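Example 1: For a Gaussian class-conditional density $p(\boldsymbol{x} \mid \mathcal{C}_k)$, it finds $d + d$ parameters (a mean and a variance $\sigma^2$ for each feature)
Example 2: For a Bernoulli class-conditional density $p(\boldsymbol{x} \mid \mathcal{C}_k)$, it finds $d$ parameters (a mean for each feature)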
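Diabetes (D)   Smoke (S)   Heart Disease (H)
Y              N           Y
Y              N           N
N              Y           N
N              Y           N
N              N           N
N              Y           Y
N              N           N
N              Y           Y
N              N           N
Y              N           N

$p(H = \text{Yes}) = 0.3$
$p(D = \text{Yes} \mid H = \text{Yes}) = 1/3$
$p(S = \text{Yes} \mid H = \text{Yes}) = 2/3$
$p(D = \text{Yes} \mid H = \text{No}) = 2/7$
$p(S = \text{Yes} \mid H = \text{No}) = 2/7$

Decision on $\boldsymbol{x} = [\text{Yes}, \text{Yes}]$ (a person who has diabetes and also smokes):
$p(H = \text{Yes} \mid \boldsymbol{x}) \propto 0.3 \times 1/3 \times 2/3 = 1/15 \approx 0.067$
$p(H = \text{No} \mid \boldsymbol{x}) \propto 0.7 \times 2/7 \times 2/7 = 2/35 \approx 0.057$
so the predicted class is $H = \text{Yes}$.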
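The estimates and the decision for this example can be reproduced by counting rows of the table. The sketch below assumes the two features are conditionally independent given H (the Naïve Bayes assumption) and uses exact fractions. The helper names `prior`, `conditional` and `joint_score` are hypothetical.

```python
from fractions import Fraction

# Rows are (Diabetes, Smoke, HeartDisease), copied from the table above.
data = [
    ("Y", "N", "Y"), ("Y", "N", "N"), ("N", "Y", "N"), ("N", "Y", "N"),
    ("N", "N", "N"), ("N", "Y", "Y"), ("N", "N", "N"), ("N", "Y", "Y"),
    ("N", "N", "N"), ("Y", "N", "N"),
]


def prior(h):
    """p(H = h), estimated as a relative frequency."""
    return Fraction(sum(1 for row in data if row[2] == h), len(data))


def conditional(feature, value, h):
    """p(feature = value | H = h); feature 0 is D, feature 1 is S."""
    rows = [row for row in data if row[2] == h]
    return Fraction(sum(1 for row in rows if row[feature] == value), len(rows))


def joint_score(d, s, h):
    """p(H = h) * p(D = d | h) * p(S = s | h), proportional to the posterior."""
    return prior(h) * conditional(0, d, h) * conditional(1, s, h)


score_yes = joint_score("Y", "Y", "Y")
score_no = joint_score("Y", "Y", "N")
posterior_yes = score_yes / (score_yes + score_no)
print(score_yes, score_no, posterior_yes)  # prints: 1/15 2/35 7/13
```

Since 7/13 is greater than 1/2, the sketch predicts heart disease for a person who has diabetes and smokes.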
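Generative approach: estimate the pdf $p(\boldsymbol{x}, \mathcal{C}_k)$ for each class $\mathcal{C}_k$ and then use it to find $p(\mathcal{C}_k \mid \boldsymbol{x})$,
or alternatively estimate both the pdf $p(\boldsymbol{x} \mid \mathcal{C}_k)$ and $p(\mathcal{C}_k)$ to find $p(\mathcal{C}_k \mid \boldsymbol{x})$
Discriminative approach: directly estimate $p(\mathcal{C}_k \mid \boldsymbol{x})$ for each class $\mathcal{C}_k$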
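$-\frac{1}{2} \boldsymbol{\mu}_1^T \boldsymbol{\Sigma}^{-1} \boldsymbol{\mu}_1 + \frac{1}{2} \boldsymbol{\mu}_2^T \boldsymbol{\Sigma}^{-1} \boldsymbol{\mu}_2 + \ln \frac{p(\mathcal{C}_1)}{p(\mathcal{C}_2)}$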
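For two Gaussian classes that share one covariance matrix, the posterior of class 1 can be written as a sigmoid of a linear function of the input. The sketch below checks this numerically in 2-D. The symbols `w` and `w0` follow the common textbook convention and are an assumption here, as are all parameter values and helper names.

```python
import math


def inv2(m):
    """Inverse of a 2x2 matrix."""
    (a, b), (c, d) = m
    det = a * d - b * c
    return [[d / det, -b / det], [-c / det, a / det]]


def matvec(m, v):
    return [sum(mij * vj for mij, vj in zip(row, v)) for row in m]


def dot(u, v):
    return sum(ui * vi for ui, vi in zip(u, v))


def gauss2(x, mu, cov):
    """Density of a 2-D Gaussian N(mu, cov) at x."""
    diff = [xi - mi for xi, mi in zip(x, mu)]
    det = cov[0][0] * cov[1][1] - cov[0][1] * cov[1][0]
    quad = dot(diff, matvec(inv2(cov), diff))
    return math.exp(-0.5 * quad) / (2 * math.pi * math.sqrt(det))


# Hypothetical parameters: two means, one shared covariance, class priors.
mu1, mu2 = [1.0, 2.0], [-1.0, 0.5]
cov = [[2.0, 0.3], [0.3, 1.0]]
p1 = 0.4
p2 = 1.0 - p1

cov_inv = inv2(cov)
w = matvec(cov_inv, [a - b for a, b in zip(mu1, mu2)])  # Sigma^-1 (mu1 - mu2)
w0 = (-0.5 * dot(mu1, matvec(cov_inv, mu1))
      + 0.5 * dot(mu2, matvec(cov_inv, mu2))
      + math.log(p1 / p2))

x = [0.3, 1.1]
sigmoid_form = 1.0 / (1.0 + math.exp(-(dot(w, x) + w0)))
j1 = gauss2(x, mu1, cov) * p1
j2 = gauss2(x, mu2, cov) * p2
bayes_form = j1 / (j1 + j2)  # posterior of class 1 from Bayes' rule
print(sigmoid_form, bayes_form)
```

The two values agree up to rounding because the quadratic terms in x cancel when both classes use the same covariance.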
$f(\boldsymbol{x}; \boldsymbol{w})$ predicts the posterior probability $P(y = 1 \mid \boldsymbol{x})$
$f(\boldsymbol{x}; \boldsymbol{w}) = 0.7$: 70% chance of the tumor being malignant
The conditional distribution $p(y \mid \boldsymbol{x}, \boldsymbol{w})$
The cost function of LR is also convex
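A minimal sketch of fitting logistic regression by gradient descent is given below. It assumes the cost is the average cross-entropy (the negative log-likelihood), since the slide's own formula did not survive extraction. The data, the step size of 0.1 and the iteration count are arbitrary illustrative choices.

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def predict(w, x):
    """f(x; w) = sigmoid(w^T x); x carries a leading 1 for the bias."""
    return sigmoid(sum(wi * xi for wi, xi in zip(w, x)))


def cost(w, X, y):
    """Average cross-entropy over the training set."""
    total = 0.0
    for xi, yi in zip(X, y):
        f = predict(w, xi)
        total -= yi * math.log(f) + (1 - yi) * math.log(1 - f)
    return total / len(X)


def gradient(w, X, y):
    """Gradient of the cost: mean of (f(x; w) - y) * x."""
    grad = [0.0] * len(w)
    for xi, yi in zip(X, y):
        err = predict(w, xi) - yi
        for j, xij in enumerate(xi):
            grad[j] += err * xij / len(X)
    return grad


# Hypothetical 1-D data with class overlap; each input is prefixed by a 1.
X = [[1.0, 0.5], [1.0, 1.5], [1.0, 2.0], [1.0, 2.5], [1.0, 3.0], [1.0, 4.0]]
y = [0, 0, 1, 0, 1, 1]

w = [0.0, 0.0]
costs = [cost(w, X, y)]
for _ in range(2000):
    g = gradient(w, X, y)
    w = [wi - 0.1 * gi for wi, gi in zip(w, g)]
    costs.append(cost(w, X, y))

print(w, costs[0], costs[-1])
```

Because the cost is convex, plain gradient descent with a small enough step does not get trapped in a poor local minimum on this problem.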
i.e., $P(y = k \mid \boldsymbol{x}, \boldsymbol{W})$
$f_k(\boldsymbol{x}) > f_j(\boldsymbol{x})$ for all $j \neq k$
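One common way to obtain multi-class posteriors of this kind is the softmax of linear scores. The slide's own formula is not recoverable from the text, so the softmax form below is an assumption, and the weight values and function names are hypothetical.

```python
import math


def softmax(scores):
    """Turn real-valued scores into probabilities that sum to 1."""
    m = max(scores)  # subtract the max for numerical stability
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def class_scores(W, x):
    """One linear score w_k^T x per class; W holds one row per class."""
    return [sum(wi * xi for wi, xi in zip(w_k, x)) for w_k in W]


W = [[0.2, -0.5], [0.1, 0.4], [-0.3, 0.1]]  # hypothetical weights, 3 classes
x = [1.0, 2.0]

f = softmax(class_scores(W, x))  # f[k] plays the role of P(y = k | x, W)
k_star = max(range(len(f)), key=f.__getitem__)
print(f, k_star)
```

The predicted class is the one whose output exceeds all the others, which here is class index 1.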
$\frac{1}{1 + e^{-\boldsymbol{w}^T \boldsymbol{x}}}$
- $2d$ parameters for the means
- $d(d + 1)/2$ parameters for the shared covariance matrix
- one parameter for the class prior $p(\mathcal{C}_1)$
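To get a feel for these counts, the sketch below adds them up for a few input dimensions d and compares them with a logistic regression model. The figure of d + 1 for logistic regression assumes one weight per feature plus a bias term, which is not stated in the text above. The function names are hypothetical.

```python
def gaussian_shared_cov_params(d):
    """2d means + d(d+1)/2 covariance entries + 1 class prior."""
    return 2 * d + d * (d + 1) // 2 + 1


def logistic_regression_params(d):
    """Assumed: d weights plus one bias term."""
    return d + 1


for d in (2, 10, 100):
    print(d, gaussian_shared_cov_params(d), logistic_regression_params(d))
```

The generative count grows quadratically in d because of the covariance matrix, while the discriminative count grows only linearly.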