Introduction to Machine Learning Evaluation: Training Error

SLIDE 1

Introduction to Machine Learning Evaluation: Training Error

compstat-lmu.github.io/lecture_i2ml

SLIDE 2

TRAINING ERROR

(also: apparent error / resubstitution error)

[Diagram: Dataset D → Learner → Fit → Model → Predict (on the same Dataset D) → Train Error]
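The flow in the diagram (fit on D, then predict on the same D) can be sketched in a few lines. This is only an illustration under assumptions: the learner `fit_line` (ordinary least squares for a straight line), the helper `mse`, and the toy data are hypothetical and not taken from the lecture.

```python
import random


def fit_line(xs, ys):
    """Hypothetical learner: least-squares fit of y = a + b * x."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    b = sxy / sxx
    a = mean_y - b * mean_x
    return lambda x: a + b * x  # the fitted model


def mse(model, xs, ys):
    """Mean squared error of the model's predictions on (xs, ys)."""
    return sum((model(x) - y) ** 2 for x, y in zip(xs, ys)) / len(xs)


# Assumed toy dataset D: a noisy straight line.
random.seed(0)
xs = [random.random() for _ in range(30)]
ys = [1.0 + 2.0 * x + random.gauss(0.0, 0.1) for x in xs]

model = fit_line(xs, ys)          # fit on D
train_error = mse(model, xs, ys)  # predict on the same D
print(f"training MSE: {train_error:.4f}")
```

The key point is that `xs` and `ys` appear in both the fit and the evaluation step, so no data unseen by the learner is involved.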

© Introduction to Machine Learning – 1 / 4
SLIDE 3

EXAMPLE: POLYNOMIAL REGRESSION

Sample data from the sinusoidal function $y = 0.5 + 0.4 \sin(2\pi x) + \epsilon$ with measurement error $\epsilon$.

[Figure: train set and true function; axes x and y, both ranging from 0 to 1.]

Assume the data-generating process is unknown. Try to approximate it with a $d$th-degree polynomial:

$$f(x \mid \theta) = \theta_0 + \theta_1 x + \cdots + \theta_d x^d = \sum_{j=0}^{d} \theta_j x^j.$$
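A minimal sketch of this setup, assuming NumPy is available. The sample size `n`, the noise standard deviation of 0.05, the seed, and the choice `d = 3` are assumptions made for illustration; the slides do not state them.

```python
import numpy as np
from numpy.polynomial import polynomial as P

rng = np.random.default_rng(0)  # arbitrary seed, for reproducibility only


def true_function(x):
    """The sinusoid from the slide, without the error term."""
    return 0.5 + 0.4 * np.sin(2 * np.pi * x)


# Assumed sample size and noise level (not given in the slides).
n = 50
x_train = rng.uniform(0.0, 1.0, size=n)
y_train = true_function(x_train) + rng.normal(0.0, 0.05, size=n)

# Least-squares fit of a degree-d polynomial.
# theta[j] is the coefficient of x**j, matching theta_0, ..., theta_d above.
d = 3
theta = P.polyfit(x_train, y_train, deg=d)


def f(x, theta):
    """Evaluate f(x | theta) = sum_j theta[j] * x**j."""
    return P.polyval(x, theta)


print(theta)
```

`numpy.polynomial.polynomial.polyfit` returns coefficients ordered from lowest to highest degree, which is why it lines up with the indexing in the formula.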

SLIDE 4

EXAMPLE: POLYNOMIAL REGRESSION

Models of different complexity, i.e., polynomials of different degree, are fitted. How should we choose d?

[Figure: train set, true function, and fitted polynomials of degree 1, 3, and 9; axes x and y.]

d = 1: MSE = 0.036: Clear underfitting
d = 3: MSE = 0.003: Pretty OK?
d = 9: MSE = 0.001: Clear overfitting

Simply using the training error seems to be a bad idea.
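The comparison across degrees can be reproduced in spirit with the sketch below. The data are simulated under assumed settings (30 points, noise standard deviation 0.05, arbitrary seed), so the printed numbers will not match the MSE values on the slide; only the ordering is the point.

```python
import numpy as np
from numpy.polynomial import polynomial as P

# Assumed simulation settings; the slides do not state n or the noise level.
rng = np.random.default_rng(1)
n = 30
x = rng.uniform(0.0, 1.0, size=n)
y = 0.5 + 0.4 * np.sin(2 * np.pi * x) + rng.normal(0.0, 0.05, size=n)


def train_mse(d):
    """Fit a degree-d polynomial by least squares and score it on the same data."""
    theta = P.polyfit(x, y, deg=d)
    residuals = P.polyval(x, theta) - y
    return float(np.mean(residuals ** 2))


train_errors = {d: train_mse(d) for d in (1, 3, 9)}
for d, err in train_errors.items():
    print(f"d={d}: training MSE = {err:.4f}")
```

Because each lower-degree polynomial is a special case of the higher-degree one, the least-squares training MSE cannot increase with d. Choosing d by minimal training error would therefore always favor the most complex model.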

SLIDE 5

TRAINING ERROR PROBLEMS

Unreliable and overly optimistic estimator of future performance. E.g., the training error of 1-NN is always zero, as each observation is its own NN during test time.

Goodness-of-fit measures like (classical) $R^2$, likelihood, AIC, BIC, deviance are all based on the training error.

For models of restricted capacity, and given enough data, the training error may provide reliable information. E.g., LM with p = 5 features, $10^6$ training points. But: impossible to determine when training error becomes unreliable.
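The 1-NN claim can be checked with a small sketch. Everything here is an illustrative assumption: the helper `one_nn_predict`, the one-dimensional features, and the labels, which are drawn as pure noise so that there is nothing to learn. The zero training error relies on all training points being distinct.

```python
import random


def one_nn_predict(train_x, train_y, query):
    """Return the label of the training point closest to `query` (1-D features)."""
    nearest = min(range(len(train_x)), key=lambda i: abs(train_x[i] - query))
    return train_y[nearest]


def error_rate(train_x, train_y, xs, ys):
    """Misclassification rate of 1-NN (built on the training data) on (xs, ys)."""
    wrong = sum(one_nn_predict(train_x, train_y, x) != y for x, y in zip(xs, ys))
    return wrong / len(xs)


random.seed(0)

# Training data: labels are coin flips, independent of x (assumed for illustration).
train_x = [random.random() for _ in range(100)]
train_y = [random.choice([0, 1]) for _ in train_x]

# Fresh data from the same assumed process.
fresh_x = [random.random() for _ in range(1000)]
fresh_y = [random.choice([0, 1]) for _ in fresh_x]

train_error = error_rate(train_x, train_y, train_x, train_y)
fresh_error = error_rate(train_x, train_y, fresh_x, fresh_y)
print(f"training error: {train_error:.3f}, error on fresh data: {fresh_error:.3f}")
```

Each training point is its own nearest neighbor, so the training error is zero even though the labels carry no signal, while the error on fresh data stays near chance level.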
