Introduction to Machine Learning
13. Learning Theory
Geoff Gordon and Alex Smola, Carnegie Mellon University
http://alex.smola.org/teaching/cmu2013-10-701x
10-701
The Problem

Training data drawn iid:
$$\{(x_1, y_1), \ldots, (x_m, y_m)\} \sim p(x, y)$$
Loss:
$$l(x, y, f(x))$$
Function class:
$$F = \{f : \Omega[f] \le c\}$$
Empirical risk:
$$\operatorname*{minimize}_{f \in F} \; \frac{1}{m} \sum_{i=1}^{m} l(x_i, y_i, f(x_i))$$
Expected risk:
$$\mathbf{E}_{(x,y) \sim p(x,y)}\left[l(x, y, f(x))\right]$$
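The gap between the empirical and the expected risk can be seen on a toy problem. The sketch below is only illustrative: the distribution (x uniform on [0, 1], labels flipped with 10% noise), the threshold function class, and every name in it are assumptions made for this example, not part of the lecture. A large fresh sample stands in for the expectation.

```python
import random


def zero_one_loss(x, y, prediction):
    """Loss l(x, y, f(x)); x is unused but kept to mirror the notation."""
    return 0.0 if prediction == y else 1.0


def empirical_risk(f, data, loss=zero_one_loss):
    """(1/m) * sum_i l(x_i, y_i, f(x_i))."""
    return sum(loss(x, y, f(x)) for x, y in data) / len(data)


def make_threshold(t):
    """Hypothetical class member: predict +1 when x >= t, else -1."""
    return lambda x: 1 if x >= t else -1


def sample(m, rng, noise=0.1):
    """Toy p(x, y): x uniform on [0, 1], y = sign(x - 0.5), flipped w.p. noise."""
    data = []
    for _ in range(m):
        x = rng.random()
        y = 1 if x >= 0.5 else -1
        if rng.random() < noise:
            y = -y
        data.append((x, y))
    return data


rng = random.Random(0)
thresholds = [i / 10 for i in range(11)]
F = [make_threshold(t) for t in thresholds]

train = sample(200, rng)
risks = [empirical_risk(f, train) for f in F]
best = min(range(len(F)), key=lambda i: risks[i])  # empirical risk minimizer

# A large independent sample approximates the expected risk of the minimizer.
fresh = sample(20000, rng)
print("threshold:", thresholds[best])
print("empirical risk:", risks[best])
print("approx. expected risk:", empirical_risk(F[best], fresh))
```

The two printed risks generally differ; bounding that difference uniformly over F is what the rest of the section is about.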
$$\Pr\left(|\hat{\mu}_m - \mu| > \epsilon\right) \le 2 \exp\left(-\frac{2 m \epsilon^2}{c^2}\right).$$
$$f^*, \qquad \epsilon \le L \sqrt{\frac{\log(2/\delta)}{2m}}$$
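Setting the right-hand side of the tail bound equal to δ and solving for ε gives the deviation above (with the range c played by the loss bound L). A minimal sketch of that arithmetic follows; the function names are made up for this example.

```python
import math


def hoeffding_tail(m, eps, L=1.0):
    """Right-hand side 2 * exp(-2 m eps^2 / L^2) of the tail bound."""
    return 2.0 * math.exp(-2.0 * m * eps ** 2 / L ** 2)


def hoeffding_epsilon(m, delta, L=1.0):
    """Deviation eps at which the tail bound equals delta."""
    return L * math.sqrt(math.log(2.0 / delta) / (2.0 * m))


for m in (100, 1000, 10000):
    eps = hoeffding_epsilon(m, delta=0.05)
    print(m, round(eps, 4), round(hoeffding_tail(m, eps), 4))
```

Because ε scales as 1/sqrt(m), quadrupling the sample size halves the deviation.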
$$\Pr\left\{|R_{\mathrm{emp}}[f] - R[f]| > \epsilon\right\} \le \sum_{f' \in F} \Pr\left\{|R_{\mathrm{emp}}[f'] - R[f']| > \epsilon\right\}$$
$$\epsilon \le L \sqrt{\frac{\log |F| + \log(2/\delta)}{2m}}$$
$$R[f^*] \le \inf_{f \in F} R_{\mathrm{emp}}[f] + L \sqrt{\frac{\log |F| + \log(2/\delta)}{2m}}$$
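To get a feel for how weakly the bound depends on the size of a finite function class, the confidence term can be evaluated directly, or inverted for the sample size. This is a sketch under the bound exactly as stated above; `finite_class_epsilon` and `sample_size_needed` are hypothetical helper names.

```python
import math


def finite_class_epsilon(m, num_functions, delta, L=1.0):
    """L * sqrt((log|F| + log(2/delta)) / (2m))."""
    return L * math.sqrt(
        (math.log(num_functions) + math.log(2.0 / delta)) / (2.0 * m)
    )


def sample_size_needed(eps, num_functions, delta, L=1.0):
    """Smallest integer m for which the confidence term is at most eps."""
    return math.ceil(
        L ** 2 * (math.log(num_functions) + math.log(2.0 / delta))
        / (2.0 * eps ** 2)
    )


for size in (10, 10 ** 3, 10 ** 6):
    print(size, round(finite_class_epsilon(1000, size, 0.05), 4),
          sample_size_needed(0.05, size, 0.05))
```

The class size enters only through log |F|, so even a millionfold larger class raises the required sample size by a modest factor.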
$$\epsilon, \qquad N(F, \epsilon)$$
$$R[f^*] \le \inf_{f \in F} R_{\mathrm{emp}}[f] + L \sqrt{\frac{\log N(F, \epsilon) + \log(2/\delta)}{2m}} + L' \epsilon$$
$$R[f^*] \le \inf_{f \in F} R_{\mathrm{emp}}[f] + \sqrt{\frac{h\left(\log(2m/h) + 1\right) + \log(4/\delta)}{m}}$$
$$\sin(x/w)$$
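The square-root term of the bound above is easy to tabulate. The sketch below assumes h denotes the VC dimension of F and simply evaluates the expression as written; the function name `vc_confidence` is made up for this example.

```python
import math


def vc_confidence(m, h, delta):
    """sqrt((h * (log(2m/h) + 1) + log(4/delta)) / m)."""
    return math.sqrt(
        (h * (math.log(2.0 * m / h) + 1.0) + math.log(4.0 / delta)) / m
    )


for m in (10 ** 3, 10 ** 4, 10 ** 5):
    for h in (10, 100):
        print(m, h, round(vc_confidence(m, h, delta=0.05), 4))
```

The term shrinks as m grows and widens as h grows, so the bound is only informative once m is large relative to h.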
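$$\Pr\left(\left|f(x_1, \ldots, x_m) - \mathbf{E}_{X_1, \ldots, X_m}\left[f(x_1, \ldots, x_m)\right]\right| > \epsilon\right) \le 2 \exp\left(-\frac{2 \epsilon^2}{C^2}\right).$$
$$C^2 = \sum_{i=1}^{m} c_i^2$$
$$\left|f(x_1, \ldots, x_i, \ldots, x_m) - f(x_1, \ldots, x_i', \ldots, x_m)\right| \le c_i$$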
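As a sanity check, the bounded-differences inequality should recover the earlier bound for the sample mean. The short derivation below assumes the standard form of the inequality, with exponent $-2\epsilon^2/C^2$, and observations confined to an interval $[a, a + c]$; the lower endpoint $a$ is introduced here purely for illustration.

```latex
\begin{align*}
f(x_1, \ldots, x_m) &= \frac{1}{m} \sum_{i=1}^{m} x_i = \hat{\mu}_m,
  \qquad x_i \in [a, a + c] \\
\left|f(\ldots, x_i, \ldots) - f(\ldots, x_i', \ldots)\right|
  &= \frac{|x_i - x_i'|}{m} \le \frac{c}{m} =: c_i \\
C^2 &= \sum_{i=1}^{m} c_i^2 = m \cdot \frac{c^2}{m^2} = \frac{c^2}{m} \\
2 \exp\left(-\frac{2 \epsilon^2}{C^2}\right)
  &= 2 \exp\left(-\frac{2 m \epsilon^2}{c^2}\right)
\end{align*}
```

The last line is exactly the tail bound for $|\hat{\mu}_m - \mu|$ stated earlier, so that bound is the special case in which $f$ is an average.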
$$\Pr\left( \sup_{f \in F} \frac{1}{m} \sum_{i=1}^{m} l(x_i, y_i, f(x_i)) - \mathbf{E}_{(x,y)}\left[l(x, y, f(x))\right] > \epsilon \right)$$
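$$\Xi(X, Y) := \sup_{f \in F} \frac{1}{m} \sum_{i=1}^{m} l(x_i, y_i, f(x_i)) - \mathbf{E}_{(x,y)}\left[l(x, y, f(x))\right]$$
$$\Xi\left(X^i \cup \{x_i'\},\; Y^i \cup \{y_i'\}\right)$$
$$\Pr\left\{\left|\Xi(X, Y) - \mathbf{E}_{X,Y}\left[\Xi(X, Y)\right]\right| > \epsilon\right\} \le 2 \exp\left(-\frac{2 m \epsilon^2}{L^2}\right)$$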
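$$
\begin{aligned}
& \mathbf{E}_{X,Y}\left[ \sup_{f \in F} \frac{1}{m} \sum_{i=1}^{m} l(x_i, y_i, f(x_i)) - \mathbf{E}_{(x,y)}\left[l(x, y, f(x))\right] \right] \\
={} & \mathbf{E}_{X,Y}\left[ \sup_{f \in F} \frac{1}{m} \sum_{i=1}^{m} l(x_i, y_i, f(x_i)) - \mathbf{E}_{X',Y'} \frac{1}{m} \sum_{i=1}^{m} \left[l(x_i', y_i', f(x_i'))\right] \right] \\
\le{} & \mathbf{E}_{X,Y,X',Y'}\left[ \sup_{f \in F} \frac{1}{m} \sum_{i=1}^{m} \left[l(x_i, y_i, f(x_i)) - l(x_i', y_i', f(x_i'))\right] \right] \\
={} & \mathbf{E}_{X,Y,X',Y'} \mathbf{E}_{\sigma}\left[ \sup_{f \in F} \frac{1}{m} \sum_{i=1}^{m} \sigma_i \left[l(x_i, y_i, f(x_i)) - l(x_i', y_i', f(x_i'))\right] \right] \\
\le{} & \frac{2}{m} \mathbf{E}_{X,Y} \mathbf{E}_{\sigma}\left[ \sup_{f \in F} \sum_{i=1}^{m} \sigma_i\, l(x_i, y_i, f(x_i)) \right]
\end{aligned}
$$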
$$R[f] \le R_{\mathrm{emp}}[f] + 2 R[F, m] + L \sqrt{\frac{\log(2/\delta)}{2m}}$$
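For a finite function class the Rademacher term can be approximated by Monte Carlo over the random signs σ. The sketch below makes two assumptions that go beyond the slides: it reads R[F, m] as (1/m) times the expected supremum of the signed loss sum, and it conditions on a single observed sample instead of also averaging over (X, Y), so it estimates the empirical rather than the expected Rademacher average. All names and the toy loss matrix are hypothetical.

```python
import math
import random


def empirical_rademacher(loss_matrix, num_draws, rng):
    """Monte Carlo estimate of (1/m) E_sigma sup_f sum_i sigma_i * l_i(f).

    loss_matrix[k][i] holds l(x_i, y_i, f_k(x_i)) for the k-th function.
    """
    m = len(loss_matrix[0])
    total = 0.0
    for _ in range(num_draws):
        # One draw of independent uniform signs sigma_i in {-1, +1}.
        sigma = [rng.choice((-1, 1)) for _ in range(m)]
        total += max(
            sum(s * l for s, l in zip(sigma, row)) for row in loss_matrix
        )
    return total / (num_draws * m)


def rademacher_bound(emp_risk, rademacher, m, delta, L=1.0):
    """R_emp[f] + 2 R[F, m] + L * sqrt(log(2/delta) / (2m))."""
    return (emp_risk + 2.0 * rademacher
            + L * math.sqrt(math.log(2.0 / delta) / (2.0 * m)))


rng = random.Random(0)
m, num_functions = 100, 8
# Toy 0/1 losses for 8 hypothetical functions on 100 points.
losses = [[float(rng.random() < 0.3) for _ in range(m)]
          for _ in range(num_functions)]

rad = empirical_rademacher(losses, num_draws=500, rng=rng)
emp = min(sum(row) / m for row in losses)
print("Rademacher estimate:", rad)
print("risk bound:", rademacher_bound(emp, rad, m, delta=0.05))
```

Unlike the log |F| term, this quantity depends on how differently the functions behave on the data, so a large class of near-identical functions is not penalized much.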