SLIDE 1

Overfitting and Regularization

March 31, 2020 Data Science CSCI 1951A Brown University Instructor: Ellie Pavlick HTAs: Josh Levin, Diane Mutako, Sol Zitter

SLIDE 2

Announcements

  • Office Hours—watch calendar
  • ML assignment out later today
  • Analysis project deliverable out soon
SLIDE 3

Today

  • Overfitting and Regularization
SLIDE 4

Train/Test Splits

  • By definition, trained models are minimizing their objective for the data they see, but not for the data they don’t see
  • What we really care about is how the model does on data we don’t see
  • So we split our training data into disjoint sets—a train set and a test set—and assess performance on test given parameters set using train.
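The split-and-evaluate recipe above can be sketched with scikit-learn (an assumption; the toy data, model, and 75/25 split are purely illustrative):

```python
# Illustrative sketch: fit on a train set, report MSE on a disjoint test set.
import numpy as np
from sklearn.linear_model import LinearRegression
from sklearn.metrics import mean_squared_error
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(0)
X = rng.uniform(0, 10, size=(100, 1))
y = 2 * X[:, 0] + rng.normal(0, 1, size=100)  # noisy linear relationship

# Disjoint train and test sets
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=0)

model = LinearRegression().fit(X_train, y_train)  # parameters set using train
train_mse = mean_squared_error(y_train, model.predict(X_train))
test_mse = mean_squared_error(y_test, model.predict(X_test))  # assess on test
```

Typically the train MSE comes out lower than the test MSE, since the parameters were chosen to minimize error on exactly the data the model saw.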
SLIDE 5

Train/Test Splits

SLIDE 6

Train/Test Splits

Train

SLIDE 7

Train/Test Splits

Train MSE = 6

SLIDE 8

Train/Test Splits

Test

SLIDE 9

Train/Test Splits

Test MSE = 12

SLIDE 10

Train/Test Splits

Train MSE = 4

Problem gets worse as models get more powerful/flexible

SLIDE 11

Train/Test Splits

Test MSE = 14

Problem gets worse as models get more powerful/flexible

SLIDE 12

Cross Validation

  • Some train/test splits are harder than others
  • To get a more stable estimate of test performance, we can use cross validation:

    accs = []
    for i in range(num_folds):
        train, test = random.split(data)
        clf.fit(train)
        accs.append(clf.score(test))
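The slide’s pseudocode can be made runnable with scikit-learn’s KFold (an assumed stand-in for `random.split`; the data and classifier are illustrative):

```python
# Each fold holds out a different slice as "test"; the mean score over folds
# is a more stable estimate than any single train/test split.
import numpy as np
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import KFold

X, y = make_classification(n_samples=200, n_features=5, random_state=0)
clf = LogisticRegression()

accs = []
for train_idx, test_idx in KFold(n_splits=5, shuffle=True, random_state=0).split(X):
    clf.fit(X[train_idx], y[train_idx])          # fit on this fold's train part
    accs.append(clf.score(X[test_idx], y[test_idx]))  # score on its test part

mean_acc = float(np.mean(accs))
```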

SLIDE 13

Overfitting

  • Models are likely to overfit when the model is more “complex” than is needed to explain the variation we care about
  • “Complex” generally means the number of parameters (i.e. features) is high
  • When the number of parameters is >= the number of observations, you can trivially memorize your training data, often without learning anything generalizable to test time

SLIDE 14

Regularization

  • Incur a cost for including more features (more non-zero weights), or for assuming features are very important (higher weights)
  • Or “early stopping”—for iterative training procedures (e.g. gradient descent), stop before the model has fully converged (i.e. you assume the final steps are spent memorizing noise)
  • By definition regularization will make your model worse during training…
  • But hopefully better at test (which is what you really care about)
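The early-stopping idea can be sketched with plain NumPy gradient descent (the toy data, learning rate, and `patience` setting are all illustrative assumptions):

```python
# Sketch: run gradient descent on train, watch loss on a held-out dev set,
# and stop once dev loss stops improving for `patience` consecutive steps.
import numpy as np

rng = np.random.default_rng(0)
true_w = np.array([1.0, -2.0, 0.5])
X_train = rng.normal(size=(80, 3))
y_train = X_train @ true_w + rng.normal(0, 0.1, 80)
X_dev = rng.normal(size=(20, 3))
y_dev = X_dev @ true_w + rng.normal(0, 0.1, 20)

w = np.zeros(3)
best_dev, best_w, patience, bad_steps = np.inf, w.copy(), 5, 0
for step in range(1000):
    grad = 2 * X_train.T @ (X_train @ w - y_train) / len(y_train)
    w = w - 0.05 * grad                       # one gradient descent step
    dev_loss = np.mean((X_dev @ w - y_dev) ** 2)
    if dev_loss < best_dev:
        best_dev, best_w, bad_steps = dev_loss, w.copy(), 0
    else:
        bad_steps += 1
        if bad_steps >= patience:             # dev loss stopped improving
            break
```

The returned `best_w` is the weight vector from the step with the lowest dev loss, not the last step taken.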

SLIDE 15

Regularization

  • Adds an extra “hyperparameter” λ which controls how much you penalize:

    min_θ loss(x; θ) + λ·cost(θ)
SLIDE 16

Dev/Validation Sets

  • Often you need to make meta-decisions (not just set the parameters), e.g.:
  • Which model is better (i.e. generalizes better to held out data)?
  • What regularization to use?
  • How many training iterations?
  • To do this, you have to split into train/dev/test, not just train/test. If you use test to set these parameters, you are “peeking” at unseen data in order to fit the model, and thus test performance is no longer actually representative of how you would do in the real world
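The train/dev/test workflow might look like this with scikit-learn (an assumption; Ridge’s `alpha` plays the role of λ, and the data and candidate values are illustrative):

```python
# Fit each candidate regularization strength on train, pick the best on dev,
# and touch test exactly once at the very end.
import numpy as np
from sklearn.linear_model import Ridge
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(0)
X = rng.normal(size=(150, 10))
y = X @ rng.normal(size=10) + rng.normal(0, 0.5, 150)

X_train, X_tmp, y_train, y_tmp = train_test_split(X, y, test_size=0.4, random_state=0)
X_dev, X_test, y_dev, y_test = train_test_split(X_tmp, y_tmp, test_size=0.5, random_state=0)

best_alpha, best_score = None, -np.inf
for alpha in [0.01, 0.1, 1.0, 10.0, 100.0]:
    score = Ridge(alpha=alpha).fit(X_train, y_train).score(X_dev, y_dev)
    if score > best_score:
        best_alpha, best_score = alpha, score

# Final, one-time evaluation on the untouched test set
test_r2 = Ridge(alpha=best_alpha).fit(X_train, y_train).score(X_test, y_test)
```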

SLIDE 17

Norms

  • L1 norm: encourages sparsity
  • L2 norm: more stable
  • Lp norm:

    l1 = Σ_i |x_i|

    l2 = √(Σ_i x_i^2)

    lp = (Σ_i |x_i|^p)^(1/p)
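For concreteness, the three norms computed directly with NumPy:

```python
import numpy as np

x = np.array([3.0, -4.0])

l1 = np.sum(np.abs(x))                   # |3| + |-4| = 7
l2 = np.sqrt(np.sum(x ** 2))             # sqrt(9 + 16) = 5
p = 3
lp = np.sum(np.abs(x) ** p) ** (1 / p)   # general Lp norm

# These agree with NumPy's built-in norm
assert l1 == np.linalg.norm(x, 1) == 7.0
assert l2 == np.linalg.norm(x, 2) == 5.0
```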
SLIDE 18

Norms

  • Linear Regression — no regularization:

    min_w Σ (y − w·x)^2

  • Lasso Regression — linear regression with L1 penalty on the loss:

    min_w Σ (y − w·x)^2 + λ·l1(w)

  • Ridge Regression — linear regression with L2 penalty on the loss:

    min_w Σ (y − w·x)^2 + λ·l2(w)

  • Logistic Regression usually uses L1 or L2 regularization by default (e.g. in sklearn)
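A quick illustration of the difference, assuming scikit-learn (data and penalty strengths are arbitrary): with mostly irrelevant features, the L1 penalty drives weights exactly to zero, while the L2 penalty only shrinks them.

```python
import numpy as np
from sklearn.linear_model import Lasso, LinearRegression, Ridge

rng = np.random.default_rng(0)
X = rng.normal(size=(100, 20))
y = 3 * X[:, 0] - 2 * X[:, 1] + rng.normal(0, 0.1, 100)  # only 2 features matter

ols = LinearRegression().fit(X, y)
lasso = Lasso(alpha=0.1).fit(X, y)   # L1 penalty
ridge = Ridge(alpha=1.0).fit(X, y)   # L2 penalty

n_zero_lasso = int(np.sum(lasso.coef_ == 0))  # weights zeroed out by L1
n_zero_ridge = int(np.sum(ridge.coef_ == 0))  # L2 shrinks but rarely zeroes
```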
SLIDE 19

Feature Selection

  • Explicitly remove features from model before training
  • Lots of heuristic techniques (no magic solutions, requires trial and error)
  • Some techniques:
  • Remove correlated features
  • Remove low-variance features
  • Iteratively add features with highest weight or information gain
  • Iteratively remove features with lowest weight or information gain
  • Dimensionality Reduction (e.g. SVD, which you are all experts in now)
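One of the heuristics above (removing low-variance features) can be sketched with scikit-learn’s VarianceThreshold; the cutoff of 0.1 here is arbitrary and requires the usual trial and error:

```python
import numpy as np
from sklearn.feature_selection import VarianceThreshold

rng = np.random.default_rng(0)
X = rng.normal(size=(50, 4))
X[:, 2] = 1.0  # a constant, zero-variance feature carries no information

selector = VarianceThreshold(threshold=0.1)
X_reduced = selector.fit_transform(X)   # drops features with variance < 0.1
kept = selector.get_support()           # boolean mask of surviving features
```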