
Introduction to Machine Learning Evaluation: Training Error


  1. Introduction to Machine Learning Evaluation: Training Error compstat-lmu.github.io/lecture_i2ml

  2. TRAINING ERROR (also: apparent error / resubstitution error) Learner + Dataset D → Fit → Model; Model + the same Dataset D → Predict → Train Error. © Introduction to Machine Learning – 1 / 4

  3. EXAMPLE: POLYNOMIAL REGRESSION Sample data from the sinusoidal function $0.5 + 0.4 \cdot \sin(2 \pi x) + \epsilon$ with measurement error $\epsilon$. [Figure: train set and true function; x-axis: x, y-axis: y, both from 0 to 1.] Assume the data-generating process is unknown. Try to approximate it with a $d$th-degree polynomial: $f(x \mid \theta) = \theta_0 + \theta_1 x + \dots + \theta_d x^d = \sum_{j=0}^{d} \theta_j x^j$.

  4. EXAMPLE: POLYNOMIAL REGRESSION Models of different complexity, i.e., polynomials of different degrees, are fitted. How should we choose $d$? [Figure: train set, true function, and fitted polynomials of degree 1, 3, and 9; x-axis: x, y-axis: y.] d = 1: MSE = 0.036, clear underfitting. d = 3: MSE = 0.003, pretty OK? d = 9: MSE = 0.001, clear overfitting. Simply using the training error to choose $d$ seems to be a bad idea: it keeps shrinking as $d$ grows, so it would favor the overfitting model.

  5. TRAINING ERROR PROBLEMS The training error is an unreliable and overly optimistic estimator of future performance. E.g., the training error of 1-NN is always zero (as long as no input appears twice with different targets), because each observation is its own nearest neighbor at prediction time. Goodness-of-fit measures like (classical) $R^2$, likelihood, AIC, BIC, and deviance are all based on the training error. For models of restricted capacity, and given enough data, the training error may provide reliable information, e.g., a linear model (LM) with p = 5 features and $10^6$ training points. But it is impossible to determine when the training error becomes unreliable.
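
The points made on slides 3 to 5 can be tried out with a minimal sketch like the one below. It is not the lecture's own code. The sample sizes, the noise standard deviation, the random seed, and all function names are assumptions chosen for illustration, so the printed numbers will not match the MSE values quoted on slide 4. The sketch fits polynomials of degree 1, 3, and 9 by least squares, compares each training MSE with the MSE on freshly sampled data, and then does the same for a hand-written 1-NN regressor.

```python
import numpy as np
from numpy.polynomial import polynomial as P

rng = np.random.default_rng(0)  # assumption: arbitrary seed for reproducibility


def true_function(x):
    # Sinusoidal function from slide 3.
    return 0.5 + 0.4 * np.sin(2 * np.pi * x)


def sample(n, noise_sd=0.05):
    # assumption: x uniform on [0, 1], Gaussian measurement error with sd 0.05
    x = rng.uniform(0.0, 1.0, size=n)
    y = true_function(x) + rng.normal(0.0, noise_sd, size=n)
    return x, y


def mse(y, y_hat):
    return float(np.mean((y - y_hat) ** 2))


x_train, y_train = sample(14)    # assumption: small training set
x_test, y_test = sample(1000)    # fresh data from the same process

# Polynomial regression: training error vs. error on new data.
results = {}
for d in (1, 3, 9):
    coefs = P.polyfit(x_train, y_train, deg=d)  # least-squares fit
    train_mse = mse(y_train, P.polyval(x_train, coefs))
    test_mse = mse(y_test, P.polyval(x_test, coefs))
    results[d] = (train_mse, test_mse)
    print(f"d={d}: train MSE = {train_mse:.4f}, test MSE = {test_mse:.4f}")


def one_nn_predict(x_query, x_ref, y_ref):
    # Return the target of the closest reference point for each query point.
    nearest = np.abs(x_query[:, None] - x_ref[None, :]).argmin(axis=1)
    return y_ref[nearest]


# 1-NN: every training point is its own nearest neighbor.
one_nn_train_mse = mse(y_train, one_nn_predict(x_train, x_train, y_train))
one_nn_test_mse = mse(y_test, one_nn_predict(x_test, x_train, y_train))
print(f"1-NN: train MSE = {one_nn_train_mse:.4f}, test MSE = {one_nn_test_mse:.4f}")
```

Because the polynomial models are nested, the least-squares training MSE cannot increase with the degree, which is why it cannot be used on its own to choose d. The 1-NN training MSE is exactly zero here because the sampled inputs are distinct, while its error on new data is not.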
