Learning From Data Lecture 11 Overfitting
What is Overfitting
When does Overfitting Occur
Stochastic and Deterministic Noise
M. Magdon-Ismail
CSCI 4100/6100
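recap: Nonlinear Transforms
$x_n \in \mathcal{X}$ is mapped by $\Phi$ to $z_n = \Phi(x_n) \in \mathcal{Z}$
$g(x) = \tilde{g}(\Phi(x)) = \mathrm{sign}(\tilde{w}^{\mathrm{T}} \Phi(x))$, where $\tilde{g}(z) = \mathrm{sign}(\tilde{w}^{\mathrm{T}} z)$
$\mathcal{X}$-space is $\mathbb{R}^{d}$; $\mathcal{Z}$-space is $\mathbb{R}^{\tilde{d}}$
Input: $x = (1, x_1, \ldots, x_d)^{\mathrm{T}}$; $z = \Phi(x) = (1, \Phi_1(x), \ldots, \Phi_{\tilde{d}}(x))^{\mathrm{T}} = (1, z_1, \ldots, z_{\tilde{d}})^{\mathrm{T}}$
Data: $x_1, x_2, \ldots, x_N$; $z_1, z_2, \ldots, z_N$
Labels: $y_1, y_2, \ldots, y_N$ in both spaces
Weights: none in $\mathcal{X}$-space; $\tilde{w} = (w_0, w_1, \ldots, w_{\tilde{d}})^{\mathrm{T}}$ in $\mathcal{Z}$-space
VC dimension: $d_{\mathrm{vc}} = d + 1$ in $\mathcal{X}$-space; $d_{\mathrm{vc}} = \tilde{d} + 1$ in $\mathcal{Z}$-space
Final hypothesis: $g(x) = \mathrm{sign}(\tilde{w}^{\mathrm{T}} \Phi(x))$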
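As a minimal sketch of the recap, the snippet below applies a feature transform and then a linear classifier in Z-space. The quadratic transform `phi_quadratic`, the helper names, and the weight vector are illustrative assumptions, not taken from the slides.

```python
from typing import Callable, List, Sequence


def phi_quadratic(x: Sequence[float]) -> List[float]:
    """Hypothetical 2nd-order transform of a 2-D input:
    (x1, x2) -> (1, x1, x2, x1^2, x1*x2, x2^2)."""
    x1, x2 = x
    return [1.0, x1, x2, x1 * x1, x1 * x2, x2 * x2]


def sign(s: float) -> int:
    """Return +1 for s >= 0 and -1 otherwise."""
    return 1 if s >= 0 else -1


def g(x: Sequence[float],
      w_tilde: Sequence[float],
      phi: Callable[[Sequence[float]], List[float]] = phi_quadratic) -> int:
    """g(x) = sign(w_tilde^T phi(x)): a linear classifier applied in Z-space."""
    z = phi(x)
    return sign(sum(w * zi for w, zi in zip(w_tilde, z)))


# Assumed weights encoding the unit circle: 1 - x1^2 - x2^2 >= 0 means +1.
w_tilde = [1.0, 0.0, 0.0, -1.0, 0.0, -1.0]
print(g([0.1, 0.2], w_tilde))  # point inside the circle
print(g([1.5, 0.0], w_tilde))  # point outside the circle
```

The boundary is a circle in X-space but a hyperplane in Z-space, which is why the linear-model machinery carries over unchanged.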
© AML Creator: Malik Magdon-Ismail
Overfitting: 2 /25
Digits data →
recap: Digits Data “1” Versus “All”
Overfitting: 3 /25
Superstitions →
Overfitting: 4 /25
Simple illustration →
Overfitting: 5 /25
Classic overfitting →
Overfitting: 6 /25
What is overfitting? →
Overfitting: 7 /25
Is it bad generalization? →
Overfitting: 8 /25
Beyond bad generalization →
Overfitting: 9 /25
Case study: simple and complex f →
Polynomial fits are a special case of linear models with feature transform $x \to (1, x, x^2, \ldots)$.
Overfitting: 10 /25
$\mathcal{H}_2$ versus $\mathcal{H}_{10}$ →
Polynomial fits are a special case of linear models with feature transform $x \to (1, x, x^2, \ldots)$.
Overfitting: 11 /25
$\mathcal{H}_2$ wins for both cases →
Simple noisy target:
          2nd Order   10th Order
E_in      0.050       0.034
E_out     0.127       9.00

Complex noiseless target:
          2nd Order   10th Order
E_in      0.029       10^{-5}
E_out     0.120       7680
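The comparison above can be reproduced in spirit with a small experiment. The sketch below fits 2nd and 10th order polynomials to one noisy data set and reports both errors. The target function, noise level, sample sizes, and seed are assumptions chosen for illustration, so the numbers will not match the table.

```python
import numpy as np
from numpy.polynomial import Polynomial


def fit_and_score(degree, x_train, y_train, x_test, y_test):
    """Least-squares polynomial fit; returns (E_in, E_out) as mean squared errors."""
    h = Polynomial.fit(x_train, y_train, degree)
    e_in = float(np.mean((h(x_train) - y_train) ** 2))
    e_out = float(np.mean((h(x_test) - y_test) ** 2))
    return e_in, e_out


def target(x):
    # Hypothetical simple target, not the one used in the lecture.
    return 0.5 * x ** 2 - x


rng = np.random.default_rng(0)   # fixed seed so the run is repeatable
sigma = 0.3                      # assumed noise standard deviation
x_train = rng.uniform(-1, 1, 15)
y_train = target(x_train) + sigma * rng.standard_normal(15)
x_test = rng.uniform(-1, 1, 10_000)
y_test = target(x_test) + sigma * rng.standard_normal(10_000)

results = {d: fit_and_score(d, x_train, y_train, x_test, y_test) for d in (2, 10)}
for d, (e_in, e_out) in results.items():
    print(f"degree {d:2d}: E_in = {e_in:.4f}, E_out = {e_out:.4f}")
```

Because the 2nd order polynomials are contained in the 10th order ones, the 10th order fit can never have the larger E_in. Only E_out reveals which model learned more.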
Overfitting: 12 /25
Is there really no noise →
Overfitting: 13 /25
Look only at the data →
Overfitting: 14 /25
Learning curves for $\mathcal{H}_2$, $\mathcal{H}_{10}$ →
Overfitting: 15 /25
Overfit measure $\sigma^2$ vs. $N$ →
[Figure: overfit measure plotted against noise level $\sigma^2$ and number of data points $N$]
Overfitting: 16 /25
Overfit measure $Q_f$ vs. $N$ →
[Figure: two panels of the overfit measure, one against noise level $\sigma^2$ and $N$, one against target complexity $Q_f$ and $N$]
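One way to estimate an overfit measure of this kind is to average E_out(g10) − E_out(g2) over many random data sets. The sketch below does so under assumptions of its own: inputs uniform on [-1, 1], Gaussian noise, squared error, and a hypothetical function name `overfit_measure`. It is not the experimental setup behind the plots.

```python
import numpy as np
from numpy.polynomial import Polynomial


def overfit_measure(target, sigma2, n_points, trials=200, seed=0):
    """Average of E_out(g10) - E_out(g2) over random data sets of size n_points.

    Use n_points > 10 so that the 10th order least-squares fit is determined.
    """
    rng = np.random.default_rng(seed)
    x_grid = np.linspace(-1, 1, 2001)  # dense grid standing in for the input distribution
    f_grid = target(x_grid)
    gaps = []
    for _ in range(trials):
        x = rng.uniform(-1, 1, n_points)
        y = target(x) + np.sqrt(sigma2) * rng.standard_normal(n_points)
        e_out = {}
        for degree in (2, 10):
            g = Polynomial.fit(x, y, degree, domain=[-1, 1])
            # Error against the noiseless target; the sigma2 offset is
            # identical for both models and cancels in the difference.
            e_out[degree] = np.mean((g(x_grid) - f_grid) ** 2)
        gaps.append(e_out[10] - e_out[2])
    return float(np.mean(gaps))


# Sweep the noise level for a fixed N, with an assumed quadratic target.
for sigma2 in (0.0, 0.5, 1.0, 2.0):
    m = overfit_measure(lambda x: x ** 2, sigma2, n_points=20, trials=50)
    print(f"sigma^2 = {sigma2:.1f}: overfit measure = {m:.4f}")
```

A positive value means the 10th order model generalizes worse than the 2nd order one on average. Sweeping `n_points` or swapping in a more complex `target` gives the other axes of the plots.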
Overfitting: 17 /25
Define ‘noise’ →
Overfitting: 18 /25
Stochastic noise →
no one can model this
Overfitting: 19 /25
Deterministic noise →
$f(x) - h^*(x)$: $\mathcal{H}$ cannot model this
$h^*$: best approximation to $f$ in $\mathcal{H}$
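To make the definition concrete, the sketch below approximates h* by a least-squares fit on a dense grid and reports how much of f is left over. The target `f`, the uniform input distribution on [-1, 1], the squared-error notion of "best", and the helper name `deterministic_noise` are all assumptions for illustration.

```python
import numpy as np
from numpy.polynomial import Polynomial


def deterministic_noise(f, degree, n_grid=2001):
    """Return (x, f(x) - h_star(x)), where h_star is the least-squares best
    polynomial of the given degree on a dense grid over [-1, 1]."""
    x = np.linspace(-1, 1, n_grid)
    h_star = Polynomial.fit(x, f(x), degree)
    return x, f(x) - h_star(x)


def f(x):
    # Hypothetical complex, noiseless target.
    return np.sin(np.pi * x)


energy = {}
for degree in (2, 10):
    _, noise = deterministic_noise(f, degree)
    energy[degree] = float(np.mean(noise ** 2))
    print(f"degree {degree:2d}: mean squared deterministic noise = {energy[degree]:.2e}")
```

The leftover shrinks as the hypothesis set grows, which illustrates that deterministic noise depends on the hypothesis set as well as on f.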
Overfitting: 20 /25
Both hurt learning →
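Stochastic noise: $y = f(x) + \text{stochastic noise}$
Re-measure $y_n$: stochastic noise changes.
Change $\mathcal{H}$: stochastic noise the same.
Deterministic noise: $y = h^*(x) + \text{deterministic noise}$
Re-measure $y_n$: deterministic noise the same.
Change $\mathcal{H}$: deterministic noise changes.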
Overfitting: 21 /25
Stochastic noise and bias-var →
measurement error
Overfitting: 22 /25
bias-var-$\sigma^2$ and noise →
measurement error
$\sigma^2$: stochastic noise; bias: deterministic noise; var: indirect impact of noise
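For reference, the decomposition behind these labels can be written out as follows. This assumes squared error, a noisy target $y = f(x) + \epsilon$ with zero-mean noise of variance $\sigma^2$ that is independent of the data set $\mathcal{D}$, and the average hypothesis $\bar{g}(x) = \mathbb{E}_{\mathcal{D}}[g^{(\mathcal{D})}(x)]$. The notation is a reconstruction, not copied from the slide.

```latex
\begin{align*}
y &= f(x) + \epsilon, \qquad \mathbb{E}[\epsilon] = 0,\quad \mathrm{Var}(\epsilon) = \sigma^2 \\
\mathbb{E}_{\mathcal{D},\epsilon}\!\left[\big(g^{(\mathcal{D})}(x) - y\big)^2\right]
  &= \underbrace{\sigma^2}_{\text{stochastic noise}}
   + \underbrace{\big(\bar{g}(x) - f(x)\big)^2}_{\text{bias: deterministic noise}}
   + \underbrace{\mathbb{E}_{\mathcal{D}}\!\left[\big(g^{(\mathcal{D})}(x) - \bar{g}(x)\big)^2\right]}_{\text{var: indirect impact of noise}}
\end{align*}
```

The cross term between $\epsilon$ and $g^{(\mathcal{D})}(x) - f(x)$ vanishes because the test noise is zero-mean and independent of $\mathcal{D}$. That leaves $\sigma^2$ plus the usual bias and variance terms.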
Overfitting: 23 /25
Noise causes overfitting →
Learning is led astray by fitting the noise more than the signal
Overfitting: 24 /25
Regularization teaser →
Overfitting: 25 /25
Regularization teaser →
Overfitting: 26 /25