CS 6316 Machine Learning
The Bias-Complexity Tradeoff
Yangfeng Ji
Department of Computer Science University of Virginia
Quiz: For a real-world machine learning problem, which of the following items are usually available to us?
L_D(h_S) ≤ min_{h′∈H} L_D(h′) + ǫ
¹Sometimes written as h_S(x) or h(x, S)
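As an illustrative sketch of this guarantee (not from the slides), we can check it numerically for a small finite hypothesis class of threshold classifiers. The distribution, noise rate, and hypothesis class below are all invented for the example: the ERM hypothesis h_S should have true risk close to the best in the class.

```python
import random

random.seed(0)

NOISE = 0.1  # label-flip probability (invented for this illustration)

# Finite hypothesis class: thresholds h_t(x) = 1[x >= t]
thresholds = [i / 20 for i in range(21)]

def true_risk(t):
    # D: x ~ Uniform[0, 1], y = 1[x >= 0.5] flipped with prob NOISE.
    # h_t disagrees with the Bayes predictor on probability mass |t - 0.5|.
    return NOISE + abs(t - 0.5) * (1 - 2 * NOISE)

def sample(n):
    data = []
    for _ in range(n):
        x = random.random()
        y = int((x >= 0.5) != (random.random() < NOISE))
        data.append((x, y))
    return data

S = sample(500)

def empirical_risk(t):
    return sum(int(x >= t) != y for x, y in S) / len(S)

t_S = min(thresholds, key=empirical_risk)     # ERM hypothesis h_S
best = min(true_risk(t) for t in thresholds)  # min_{h' in H} L_D(h')
print(f"L_D(h_S) = {true_risk(t_S):.3f}, min L_D(h') = {best:.3f}")
```

With 500 samples, the learned threshold lands near 0.5, so L_D(h_S) exceeds the best-in-class risk by only a small ǫ.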
[Figure: parameter space (w1, w2), showing the learned hypothesis hS, a competing hypothesis h′, the target fD, and the gap ǫ]
[Figure: parameter space (w1, w2), showing the target fD, the best-in-class hypothesis h∗, and the learned hypothesis hS, with approximation error ǫapp and estimation error ǫest]
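The decomposition pictured above is the standard error decomposition (cf. Shalev-Shwartz and Ben-David, 2014). Written out:

```latex
L_{\mathcal{D}}(h_S)
  = \underbrace{\min_{h \in \mathcal{H}} L_{\mathcal{D}}(h)}_{\epsilon_{\mathrm{app}}}
  \;+\;
  \underbrace{\Bigl( L_{\mathcal{D}}(h_S) - \min_{h \in \mathcal{H}} L_{\mathcal{D}}(h) \Bigr)}_{\epsilon_{\mathrm{est}}}
```

Here ǫ_app is fixed by the choice of hypothesis space H (how well the best hypothesis in H can do), while ǫ_est measures how far the learned h_S falls short of that best hypothesis because it is trained on a finite sample S.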
For example:
◮ replacing linear predictors with nonlinear predictors
H_1 = {w0 + w1x : w0, w1 ∈ R}

H_3 = {w0 + w1x + w2x² + w3x³ : w0, w1, w2, w3 ∈ R} (14)

H_15 = {w0 + w1x + · · · + w15x¹⁵ : w0, w1, · · · , w15 ∈ R} (15)
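To make the tradeoff among H_1, H_3, and H_15 concrete, here is a sketch (assuming numpy; the target function sin(2πx), noise level, and sample sizes are invented for illustration). Least-squares polynomial fitting is ERM over each class under squared loss:

```python
import numpy as np

rng = np.random.default_rng(0)

# Training data: noisy samples of a smooth target on [0, 1]
# (target and noise level are invented for this illustration)
x = rng.uniform(0, 1, 20)
y = np.sin(2 * np.pi * x) + rng.normal(0, 0.2, size=x.size)
x_test = np.linspace(0, 1, 200)
y_test = np.sin(2 * np.pi * x_test)

results = {}
for degree in (1, 3, 15):
    # Least-squares fit = ERM over H_degree with squared loss
    # (np.polyfit may warn about conditioning at degree 15)
    coeffs = np.polyfit(x, y, degree)
    train_mse = float(np.mean((np.polyval(coeffs, x) - y) ** 2))
    test_mse = float(np.mean((np.polyval(coeffs, x_test) - y_test) ** 2))
    results[degree] = (train_mse, test_mse)
    print(f"H_{degree}: train MSE {train_mse:.3f}, test MSE {test_mse:.3f}")
```

Training error can only decrease as the class grows, since H_1 ⊂ H_3 ⊂ H_15; typically H_1 underfits (high test error from approximation error) while H_15 overfits (high test error from estimation error).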
E[{h(x, S) − fD(x)}²]
= E[{h(x, S) − E[h(x, S)] + E[h(x, S)] − fD(x)}²]
= E[{h(x, S) − E[h(x, S)]}²] + {E[h(x, S)] − fD(x)}² + 2E[{h(x, S) − E[h(x, S)]}] · {E[h(x, S)] − fD(x)}
= E[{h(x, S) − E[h(x, S)]}²] + {E[h(x, S)] − fD(x)}² + 2{E[h(x, S)] − E[h(x, S)]} · {E[h(x, S)] − fD(x)}
= E[{h(x, S) − E[h(x, S)]}²] + {E[h(x, S)] − fD(x)}²
= variance + bias²

where the expectation E[·] is taken over the random draw of the training set S.
1: for k = 1, · · · , K do
2:     sample a training set Sk
3:     learn the predictor h(·, Sk)
4: end for
5: Output: (1/K) Σ_{k=1}^{K} h(x, Sk)
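A minimal sketch of this repeated-sampling procedure in Python (numpy assumed; the target function, linear estimator, and constants are invented for illustration). Training on K independently drawn sets and evaluating at a single point x0 verifies the decomposition MSE = bias² + variance numerically:

```python
import numpy as np

rng = np.random.default_rng(1)

def f_D(x):
    # Target function (invented for this illustration)
    return np.sin(2 * np.pi * x)

K = 200   # number of independently drawn training sets
n = 15    # size of each training set
x0 = 0.3  # point at which the decomposition is evaluated

preds = []
for _ in range(K):
    x = rng.uniform(0, 1, n)
    y = f_D(x) + rng.normal(0, 0.3, n)
    coeffs = np.polyfit(x, y, 1)       # h(., S_k): a linear fit
    preds.append(np.polyval(coeffs, x0))
preds = np.array(preds)

mse = np.mean((preds - f_D(x0)) ** 2)    # E[{h(x0, S) - f_D(x0)}^2]
bias2 = (np.mean(preds) - f_D(x0)) ** 2  # {E[h(x0, S)] - f_D(x0)}^2
var = np.var(preds)                      # E[{h(x0, S) - E[h(x0, S)]}^2]
print(f"MSE {mse:.4f} = bias^2 {bias2:.4f} + variance {var:.4f}")
```

The two sides agree to machine precision, since the decomposition is an algebraic identity once the expectations are replaced by averages over the same K samples.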
Using Algorithm 1, we can visualize the bias.
[Figure: predictors learned from H_3 across different training sets]

[Figure: predictors learned from H_15 across different training sets]
◮ bias: determined by the hypothesis space
◮ variance: determined by the training set S
H_half = {w0 + w1x1 + w2x2 ≥ 0 : w0, w1, w2 ∈ R}

[Figure: three labeled points in the (x1, x2) plane]

For every possible labeling of these three points, H_half can realize all of them.

VCdim(H_half) = 3
VCdim(H_rect) = 4

H_half = {w0 + w1x1 + w2x2 ≥ 0 : w0, w1, w2 ∈ R} (23)

[Figure: a halfspace in the (x1, x2) plane]
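To see that VCdim(H_half) ≥ 3, the sketch below brute-forces a small integer grid of weight vectors over three example points (points and grid are invented for illustration) and confirms that every one of the 2³ = 8 labelings is realized by some halfspace:

```python
import itertools

# Three example points in general position (invented for this sketch)
points = [(0, 0), (1, 0), (0, 1)]

def labeling(w0, w1, w2):
    # Labels assigned by the halfspace w0 + w1*x1 + w2*x2 >= 0
    return tuple(1 if w0 + w1 * x1 + w2 * x2 >= 0 else -1
                 for x1, x2 in points)

# Brute-force search over a small integer grid of weight vectors
achievable = {labeling(w0, w1, w2)
              for w0, w1, w2 in itertools.product(range(-3, 4), repeat=3)}
print(len(achievable))  # 8 = 2^3: the three points are shattered
```

No set of four points can be shattered by halfspaces in the plane (e.g., the XOR labeling of four points in convex position is not linearly separable), which gives VCdim(H_half) = 3.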
H_sin = {sin(α · x) : α ∈ R}

[Figure: plot of sin(α · x) on the interval [−6, 6]]

VCdim(H_sin) = ∞
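The classic construction behind this fact (cf. Shalev-Shwartz and Ben-David, 2014; not spelled out in the slides) shatters the points x_i = 2^(−i): for any labeling y ∈ {0, 1}^m, choosing α = π(1 + Σ_i 2^i (1 − y_i)) makes sign(sin(α·x_i)) match y_i. The sketch below verifies this numerically for m = 5:

```python
import itertools
import math

m = 5
points = [2.0 ** (-i) for i in range(1, m + 1)]  # x_i = 2^{-i}

count = 0
for labels in itertools.product([0, 1], repeat=m):
    # Classic choice of alpha realizing this labeling:
    # y_i = 1 exactly when sin(alpha * x_i) > 0
    alpha = math.pi * (1 + sum(2 ** i * (1 - y)
                               for i, y in zip(range(1, m + 1), labels)))
    realized = tuple(int(math.sin(alpha * x) > 0) for x in points)
    if realized == labels:
        count += 1
print(count)  # 32 = 2^5: all labelings of the 5 points are realized
```

Since m was arbitrary, arbitrarily large point sets can be shattered this way, so VCdim(H_sin) = ∞.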
References

Mohri, M., Rostamizadeh, A., and Talwalkar, A. (2018). Foundations of Machine Learning. MIT Press.

Shalev-Shwartz, S. and Ben-David, S. (2014). Understanding Machine Learning: From Theory to Algorithms. Cambridge University Press.