15-780 – Graduate Artificial Intelligence: Machine learning
- J. Zico Kolter (this lecture) and Nihar Shah
Carnegie Mellon University Spring 2020
Outline
- What is machine learning?
- Linear regression
- Linear classification
- Nonlinear methods
Date        High Temperature (F)  Peak Demand (GW)
2011-06-01  84.0                  2.651
2011-06-02  73.0                  2.081
2011-06-03  75.2                  1.844
2011-06-04  84.9                  1.959
…           …                     …
[Figure: Peak Demand (GW) versus High Temperature (F), showing observed days and the prediction]
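Linear hypothesis over $n$ features:

$$h_\theta(x) = \sum_{j=1}^{n} \theta_j x_j = \theta^T x$$

Least-squares objective in matrix form, where $X \in \mathbb{R}^{m \times n}$ has rows $x^{(i)T}$ and $y \in \mathbb{R}^m$ stacks the outputs:

$$\sum_{i=1}^{m} \left(\theta^T x^{(i)} - y^{(i)}\right)^2 = \|X\theta - y\|_2^2$$

Gradient with respect to $\theta$:

$$\nabla_\theta \|X\theta - y\|_2^2 = 2X^T(X\theta - y)$$

Setting the gradient to zero:

$$2X^T(X\theta - y) = 0$$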
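Setting the gradient $2X^T(X\theta - y)$ to zero gives the normal equations $X^T X \theta = X^T y$. The following is a minimal NumPy sketch of solving them, not the lecture's own code. The function name `fit_least_squares` is hypothetical, the data are the four temperature/demand rows listed in the table earlier, and the appended constant feature (for an intercept) is an assumption.

```python
import numpy as np

def fit_least_squares(X, y):
    """Solve the normal equations X^T X theta = X^T y for theta."""
    # Solving the linear system avoids forming an explicit matrix inverse.
    return np.linalg.solve(X.T @ X, X.T @ y)

# The four (high temperature, peak demand) rows shown in the table.
temps = np.array([84.0, 73.0, 75.2, 84.9])
demand = np.array([2.651, 2.081, 1.844, 1.959])

# Assumption: append a constant feature of 1 so the model has an intercept.
X = np.column_stack([temps, np.ones_like(temps)])
theta = fit_least_squares(X, demand)

# At the minimizer the gradient 2 X^T (X theta - y) is zero up to rounding.
gradient = 2 * X.T @ (X @ theta - demand)
print("theta:", theta)
print("gradient at theta:", gradient)
```

With only four points this fit is purely illustrative; the same function applies unchanged to the full dataset.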
Poll: which solution is which?
1. Green is squared loss, red is absolute
2. Red is squared loss, green is absolute
3. Those lines look identical to me
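$$\ell_{0/1} = \mathbf{1}\{y \cdot h_\theta(x) \le 0\}$$

$$\ell_{\text{logistic}} = \log\left(1 + \exp(-y \cdot h_\theta(x))\right)$$

$$\ell_{\text{hinge}} = \max\{1 - y \cdot h_\theta(x),\ 0\}$$

$$\ell_{\exp} = \exp(-y \cdot h_\theta(x))$$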
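All four classification losses depend only on the margin $y \cdot h_\theta(x)$. The sketch below evaluates them with NumPy, assuming labels $y \in \{-1, +1\}$ and a real-valued hypothesis output. The function names and the sample values are hypothetical, chosen only for illustration.

```python
import numpy as np

def zero_one_loss(h, y):
    # 1 when the margin y * h is non-positive (a mistake), else 0.
    return (y * h <= 0).astype(float)

def logistic_loss(h, y):
    # log(1 + exp(-y * h)); log1p is more accurate when exp(-y * h) is small.
    return np.log1p(np.exp(-y * h))

def hinge_loss(h, y):
    # max{1 - y * h, 0}: zero once the margin is at least 1.
    return np.maximum(1 - y * h, 0)

def exp_loss(h, y):
    # exp(-y * h): grows quickly for confident mistakes.
    return np.exp(-y * h)

# Hypothetical hypothesis outputs for three examples, all with label +1.
h = np.array([2.0, 0.5, -1.0])
y = np.array([1.0, 1.0, 1.0])

for name, fn in [("0/1", zero_one_loss), ("logistic", logistic_loss),
                 ("hinge", hinge_loss), ("exp", exp_loss)]:
    print(name, fn(h, y))
```

The logistic, hinge, and exponential losses are continuous in the margin, whereas the 0/1 loss is a step function, which is what makes the former three amenable to gradient-based minimization.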
$$\theta = \begin{bmatrix} 1.456 \\ 1.848 \\ -0.189 \end{bmatrix}$$
[Figure: all data split into a training set (e.g. 70%) and a holdout / validation set (e.g. 30%)]
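A holdout split like the 70% / 30% example can be sketched as a random permutation of the row indices. This is one possible implementation under stated assumptions: the function name `holdout_split`, the fixed seed, and the toy arrays are hypothetical.

```python
import numpy as np

def holdout_split(X, y, holdout_fraction=0.3, seed=0):
    """Randomly split (X, y) into a training part and a holdout part."""
    rng = np.random.default_rng(seed)
    m = X.shape[0]
    perm = rng.permutation(m)  # shuffle the row indices
    n_holdout = int(round(holdout_fraction * m))
    holdout_idx, train_idx = perm[:n_holdout], perm[n_holdout:]
    return X[train_idx], y[train_idx], X[holdout_idx], y[holdout_idx]

# Hypothetical toy data: 10 examples with 2 features each.
X = np.arange(20, dtype=float).reshape(10, 2)
y = np.arange(10, dtype=float)

X_train, y_train, X_hold, y_hold = holdout_split(X, y)
print(X_train.shape, X_hold.shape)
```

Shuffling before splitting matters when the rows are ordered, for example by date, since an unshuffled split would train and validate on different time periods.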
[Figure: k-fold cross-validation, with all data divided into Fold 1, Fold 2, …, Fold $k$]
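In k-fold cross-validation the data is divided into $k$ folds, and each fold serves once as the validation set while the remaining folds are used for training. The index bookkeeping can be sketched as follows; the generator name `k_fold_indices` and the seed are hypothetical, and the sketch assumes $k \ge 2$.

```python
import numpy as np

def k_fold_indices(m, k, seed=0):
    """Yield (train_idx, val_idx) pairs for k-fold cross-validation (k >= 2)."""
    rng = np.random.default_rng(seed)
    # Shuffle the m indices, then cut them into k nearly equal folds.
    folds = np.array_split(rng.permutation(m), k)
    for i in range(k):
        val_idx = folds[i]
        train_idx = np.concatenate([folds[j] for j in range(k) if j != i])
        yield train_idx, val_idx

# Example: 10 data points, 3 folds.
splits = list(k_fold_indices(10, 3))
for train_idx, val_idx in splits:
    print("train:", sorted(train_idx.tolist()), "val:", sorted(val_idx.tolist()))
```

A model would be fit on each `train_idx` and scored on the matching `val_idx`, with the $k$ scores averaged to estimate generalization performance.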
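Regularized objective, a squared loss plus a squared penalty weighted by $\lambda$:

$$\|X\theta - y\|_2^2 + \lambda \|\theta\|_2^2$$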
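Assuming the regularized objective here is the squared loss plus an $\ell_2$ penalty, $\|X\theta - y\|_2^2 + \lambda\|\theta\|_2^2$ (ridge regression), its gradient $2X^T(X\theta - y) + 2\lambda\theta$ vanishes when $(X^T X + \lambda I)\theta = X^T y$. The sketch below solves that system; the function name `fit_ridge` and the synthetic data are hypothetical.

```python
import numpy as np

def fit_ridge(X, y, lam):
    """Minimize ||X theta - y||_2^2 + lam * ||theta||_2^2 in closed form."""
    n = X.shape[1]
    # Setting the gradient to zero gives (X^T X + lam * I) theta = X^T y.
    return np.linalg.solve(X.T @ X + lam * np.eye(n), X.T @ y)

# Hypothetical synthetic data: 20 examples, 3 features, noisy linear target.
rng = np.random.default_rng(0)
X = rng.normal(size=(20, 3))
y = X @ np.array([1.0, -2.0, 0.5]) + 0.1 * rng.normal(size=20)

theta_small = fit_ridge(X, y, lam=0.1)
theta_large = fit_ridge(X, y, lam=100.0)
print("lam=0.1:  ", theta_small)
print("lam=100.0:", theta_large)
```

A larger $\lambda$ shrinks the parameters toward zero, trading a worse fit on the training data for a simpler model.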
[Figure: loss versus number of samples, with training, validation, and testing curves and the desired performance level]
[Figure: all data split into a training set (e.g. 50%), a holdout / validation set (e.g. 20%), and a test set (e.g. 30%)]