Week 2 Video 5: Cross-Validation and Over-Fitting (PowerPoint PPT Presentation)



SLIDE 1

Cross-Validation and Over-Fitting

Week 2 Video 5

SLIDE 2

Over-Fitting

• I've mentioned over-fitting a few times during the last few weeks
• Fitting to the noise as well as the signal

SLIDE 3

Over-Fitting

[Figure: two scatter plots on matching 0-25 axes, one labeled "Good fit" and one labeled "Over fit"]

SLIDE 4

Reducing Over-Fitting

• Use simpler models
  ◦ Fewer variables (BIC, AIC, Occam's Razor)
  ◦ Less complex functions (MDL)

SLIDE 5

Eliminating Over-Fitting?

• Every model is over-fit in some fashion
• The questions are:
  ◦ How bad?
  ◦ What is it over-fit to?

SLIDE 6

Assessing Generalizability

• Does your model transfer to new contexts?
• Or is it over-fit to a specific context?

SLIDE 7

Training Set/Test Set

• Split your data into a training set and a test set
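The split above can be sketched in a few lines of Python. The shuffle, the 80/20 split fraction, and the toy integer data are illustrative assumptions, not part of the lecture:

```python
import random

def train_test_split(data, test_fraction=0.2, seed=0):
    """Shuffle a copy of the data, then hold out the last test_fraction as the test set."""
    rng = random.Random(seed)          # fixed seed for reproducibility (an assumption)
    shuffled = list(data)              # copy so the caller's list is untouched
    rng.shuffle(shuffled)
    cut = int(len(shuffled) * (1 - test_fraction))
    return shuffled[:cut], shuffled[cut:]

train, test = train_test_split(list(range(100)))
print(len(train), len(test))  # 80 20
```

Shuffling before splitting matters when the data arrives in some meaningful order (e.g. by time or by student), since a plain prefix/suffix split would then be systematically biased.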

SLIDE 8

Notes

• Model tested on unseen data
• But uses data unevenly

SLIDE 9

Cross-validation

• Split data points into N equal-size groups

SLIDE 10

Cross-validation

• Train on all groups but one, test on the last group
• For each possible combination

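The train-on-all-but-one loop can be sketched as follows. The random fold assignment, the stand-in "model" (predict the training mean), and the mean-absolute-error score are illustrative assumptions:

```python
import random

def k_fold_indices(n_points, k, seed=0):
    """Randomly assign each data point index to one of k roughly equal-size folds."""
    indices = list(range(n_points))
    random.Random(seed).shuffle(indices)
    return [indices[i::k] for i in range(k)]   # every k-th index goes to the same fold

def cross_validate(points, k=5):
    """Train on all folds but one, test on the held-out fold, for each fold in turn."""
    folds = k_fold_indices(len(points), k)
    scores = []
    for held_out in folds:
        test_set = set(held_out)
        train = [points[i] for i in range(len(points)) if i not in test_set]
        test = [points[i] for i in test_set]
        # stand-in "model": predict the training mean; score by mean absolute error
        prediction = sum(train) / len(train)
        scores.append(sum(abs(x - prediction) for x in test) / len(test))
    return sum(scores) / len(scores)   # average error across all k folds
```

Every point is used for testing exactly once and for training k-1 times, which is what fixes the "uses data unevenly" issue of a single train/test split.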

SLIDE 16

You can do both!

• Use cross-validation to tune algorithm parameters
  ◦ Or select algorithms
• Use a held-out test set to get a less over-fit final estimate of model goodness
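One way to combine the two can be sketched under toy assumptions: the "models" are single-number summarizers (training mean vs. training median), scored by mean absolute error. The data, split sizes, and candidate models are all made up for illustration:

```python
import random
import statistics

def mae(prediction, points):
    """Mean absolute error of a single-number prediction."""
    return sum(abs(x - prediction) for x in points) / len(points)

def cv_score(data, summarize, k=5, seed=0):
    """Estimate a summarizer's error by k-fold cross-validation."""
    idx = list(range(len(data)))
    random.Random(seed).shuffle(idx)
    folds = [idx[i::k] for i in range(k)]
    errors = []
    for fold in folds:
        held_out = set(fold)
        train = [data[i] for i in idx if i not in held_out]
        errors.append(mae(summarize(train), [data[i] for i in fold]))
    return sum(errors) / k

rng = random.Random(1)
data = [rng.gauss(5, 2) for _ in range(200)]

# 1) hold out a final test set BEFORE any tuning or selection
train, test = data[:160], data[160:]

# 2) cross-validate on the training set only, to select between the candidates
candidates = {"mean": statistics.fmean, "median": statistics.median}
best = min(candidates, key=lambda name: cv_score(train, candidates[name]))

# 3) report model goodness once, on the untouched test set
final_error = mae(candidates[best](train), test)
```

The key design point is ordering: the test set is carved off first and touched only once, after all cross-validated selection is finished, so the final estimate is not inflated by the selection process.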

SLIDE 17

How many groups?

• K-fold
  ◦ Pick a number K, split into this number of groups
• Leave-out-one
  ◦ Every data point is a fold

SLIDE 18

How many groups?

• K-fold
  ◦ Pick a number K, split into this number of groups
  ◦ Quicker; preferred by some theoreticians
• Leave-out-one
  ◦ Every data point is a fold
  ◦ More stable
  ◦ Avoids issue of how to select folds (stratification issues)

SLIDE 19

Cross-validation variants

• Flat Cross-Validation
  ◦ Each point has an equal chance of being placed into each fold
• Stratified Cross-Validation
  ◦ Biases fold selection so that some variable is equally represented in each fold
  ◦ The variable you're trying to predict
  ◦ Or some variable that is thought to be an important context
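A stratified fold assignment can be sketched by dealing each class out round-robin across folds. The pass/fail labels and the round-robin scheme are illustrative assumptions, not a specific tool's implementation:

```python
import random
from collections import defaultdict

def stratified_folds(labels, k, seed=0):
    """Assign point indices to k folds so each label value is spread evenly across folds."""
    rng = random.Random(seed)
    by_label = defaultdict(list)
    for i, y in enumerate(labels):
        by_label[y].append(i)
    folds = [[] for _ in range(k)]
    for members in by_label.values():
        rng.shuffle(members)
        for j, i in enumerate(members):
            folds[j % k].append(i)   # deal this class out round-robin
    return folds

labels = ["pass"] * 80 + ["fail"] * 20
folds = stratified_folds(labels, 5)
# each of the 5 folds gets 16 "pass" and 4 "fail" points
```

With a flat (unstratified) split, a rare class like the 20% "fail" group could easily be over- or under-represented in some folds; stratification pins its share in every fold.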

SLIDE 20

Student-level cross-validation

• Folds are selected so that no student's data is represented in two folds
• Allows you to test model generalizability to new students
• As opposed to testing model generalizability to new data from the same students
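One hypothetical sketch of student-level folds: assign each student (not each row) to a fold, so all of a student's rows land together. The row format and field names are made up for illustration, and this is not any particular tool's built-in operator:

```python
import random

def student_level_folds(rows, k, seed=0):
    """Fold by student, so no student's rows appear in more than one fold."""
    students = sorted({r["student"] for r in rows})   # hypothetical "student" field
    random.Random(seed).shuffle(students)
    fold_of = {s: i % k for i, s in enumerate(students)}   # whole student -> one fold
    folds = [[] for _ in range(k)]
    for j, r in enumerate(rows):
        folds[fold_of[r["student"]]].append(j)
    return folds

# hypothetical log data: several rows per student
rows = [{"student": f"s{i % 10}", "correct": i % 2} for i in range(50)]
folds = student_level_folds(rows, k=5)
```

Because the random assignment happens at the student level, testing on a held-out fold always means testing on students the model never saw during training.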

SLIDE 21

Student-level cross-validation

• Usually seen as the minimum cross-validation needed, in the EDM conference
• Papers that don't pay attention to this issue are usually rejected
  ◦ OK to explicitly choose something else and discuss that choice
  ◦ Not OK to just ignore the issue and do what's easiest

SLIDE 22

Student-level cross-validation

• Easy to do with Batch X-Validation in RapidMiner

SLIDE 23

Other Levels Sometimes Used for Cross-Validation

• Lesson/Content
• School
• Demographic (Urban/Rural/Suburban, Race, Gender)
• Software Package
• Session (in MOOCs, behavior in later sessions differs from behavior in earlier sessions; Whitehill et al., 2017)

SLIDE 24

Important Consideration

• Where do you want to be able to use your model?
  ◦ New students?
  ◦ New schools?
  ◦ New populations?
  ◦ New software content?
• Make sure to cross-validate at that level

SLIDE 25

Next Lecture

• More on Generalization and Validity