Introduction to Machine Learning Evaluation: Test Error - PowerPoint PPT Presentation





SLIDE 1

Introduction to Machine Learning Evaluation: Test Error

[Figure: MSE vs. degree of polynomial, with training error and test error curves. Low degrees: underfitting, high bias, low variance; high degrees: overfitting, low bias, high variance.]

Learning goals

- Understand the definition of the test error
- Understand how overfitting can be seen in the test error

SLIDE 2

TEST ERROR

[Diagram: Dataset D is split into a training dataset and a test dataset. The learner is fit on the training dataset to produce a model; the model predicts on the test dataset, yielding the test error.]

© Introduction to Machine Learning – 1 / 8
SLIDE 3

TEST ERROR AND HOLD-OUT SPLITTING

Split the data into 2 parts, e.g., 2/3 for training and 1/3 for testing. Evaluate on the data not used for model building.

[Diagram: hold-out splitting, as on the previous slide.]
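The hold-out procedure can be sketched in a few lines. This is a minimal illustration, not the course's code: the data are synthetic (the sinusoidal example from a later slide), and numpy is assumed to be available.

```python
import numpy as np

rng = np.random.default_rng(0)

# Synthetic dataset D: 150 observations of (x, y) from a noisy sinusoid
x = rng.uniform(0, 1, 150)
y = 0.5 + 0.4 * np.sin(2 * np.pi * x) + rng.normal(0, 0.05, 150)

# Hold-out split: 2/3 of the indices for training, 1/3 for testing
idx = rng.permutation(len(x))
n_train = 2 * len(x) // 3
train_idx, test_idx = idx[:n_train], idx[n_train:]

# Fit the learner on the training data only (here: a cubic polynomial) ...
coeffs = np.polyfit(x[train_idx], y[train_idx], deg=3)

# ... and estimate the test error (MSE) on the untouched test data
y_hat = np.polyval(coeffs, x[test_idx])
test_mse = np.mean((y[test_idx] - y_hat) ** 2)
```

The key point the sketch encodes: `x[test_idx]` never enters `np.polyfit`, so `test_mse` is computed on data the model has not seen.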

SLIDE 4

TEST ERROR

Let’s consider the following example: sample data from the sinusoidal function y = 0.5 + 0.4 · sin(2πx) + ε.

[Figure: train set, test set, and true function on x, y ∈ [0, 1].]

Try to approximate it with a d-th-degree polynomial:

f(x | θ) = θ0 + θ1 x + · · · + θd x^d = Σ_{j=0}^{d} θj x^j.
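The least-squares fit of this polynomial model can be sketched as follows; the function names are ours, and numpy is assumed:

```python
import numpy as np

def poly_features(x, d):
    """Design matrix with columns x^0, x^1, ..., x^d."""
    return np.vander(x, d + 1, increasing=True)

def fit_polynomial(x, y, d):
    """Least-squares estimate of theta for a d-th-degree polynomial."""
    X = poly_features(x, d)
    theta, *_ = np.linalg.lstsq(X, y, rcond=None)
    return theta

def predict(theta, x):
    """Evaluate f(x | theta) = sum_j theta_j x^j."""
    return poly_features(x, len(theta) - 1) @ theta
```

The degree d is the complexity knob turned on the following slides.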

SLIDE 5

TEST ERROR

[Figure: polynomial fits of degree 1, 3, and 9 to the train set, shown with the test set and the true function.]

- d = 1: MSE = 0.038: clear underfitting
- d = 3: MSE = 0.002: pretty OK
- d = 9: MSE = 0.046: clear overfitting

SLIDE 6

TEST ERROR

Plot evaluation measure for all polynomial degrees:

[Figure: training error and test error (MSE) vs. degree of polynomial (1 to 10). Left region: underfitting, high bias, low variance; right region: overfitting, low bias, high variance.]

- Increasing model complexity tends to decrease the training error.
- The test error shows a U-shape: first we underfit, then we overfit, with the sweet spot in the middle.
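This curve can be reproduced with a small sketch. The data are synthetic (the same sinusoidal setup as before), so the exact MSE values will differ from the slide; numpy is assumed:

```python
import numpy as np

rng = np.random.default_rng(1)

# Small train set, larger test set, both from the noisy sinusoid
x_train = rng.uniform(0, 1, 30)
y_train = 0.5 + 0.4 * np.sin(2 * np.pi * x_train) + rng.normal(0, 0.05, 30)
x_test = rng.uniform(0, 1, 200)
y_test = 0.5 + 0.4 * np.sin(2 * np.pi * x_test) + rng.normal(0, 0.05, 200)

# Train and test MSE for polynomial degrees 1..10
train_mse, test_mse = [], []
for d in range(1, 11):
    theta = np.polyfit(x_train, y_train, d)
    train_mse.append(np.mean((y_train - np.polyval(theta, x_train)) ** 2))
    test_mse.append(np.mean((y_test - np.polyval(theta, x_test)) ** 2))
```

Plotting `train_mse` and `test_mse` against the degree gives the monotone training-error curve and the U-shaped test-error curve described above.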

SLIDE 7

TEST ERROR PROBLEMS

- Test data has to be i.i.d. with respect to the training data, i.e., drawn from the same distribution.
- Bias-variance of hold-out: the smaller the training set, the worse the model, hence a (pessimistically) biased estimate; the smaller the test set, the higher the variance of the estimate.
- If the size of our initial, complete data set D is limited, single train-test splits can be problematic.
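The variance point can be illustrated with a small simulation. The setup is hypothetical and ours, not the course's: we evaluate the true function itself, so all remaining error is noise, and compare how much the hold-out MSE estimate fluctuates for a small vs. a large test set (numpy assumed):

```python
import numpy as np

rng = np.random.default_rng(2)

def holdout_estimate_spread(n_test, n_reps=200):
    """Std. dev. of the hold-out MSE estimate across repeated test sets."""
    estimates = []
    for _ in range(n_reps):
        x = rng.uniform(0, 1, n_test)
        y = 0.5 + 0.4 * np.sin(2 * np.pi * x) + rng.normal(0, 0.05, n_test)
        # Fixed "model": the true function, so the estimate's fluctuation
        # comes purely from the finite test sample
        y_hat = 0.5 + 0.4 * np.sin(2 * np.pi * x)
        estimates.append(np.mean((y - y_hat) ** 2))
    return np.std(estimates)

spread_small = holdout_estimate_spread(10)    # tiny test set
spread_large = holdout_estimate_spread(200)   # larger test set
```

With only 10 test points the MSE estimate scatters far more around the true noise level than with 200, which is exactly the variance problem of small hold-out test sets.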

SLIDE 8

TEST ERROR PROBLEMS

A major point of confusion: in ML we are in a weird situation. We are usually given one data set. At the end of our model selection and evaluation process we will likely fit one model on exactly that complete data set. As training-error evaluation does not work, we have nothing left to evaluate exactly that model. Hold-out splitting (and resampling) are tools to estimate the future performance. All of the models produced during that phase of evaluation are intermediate results.
