Regularization: The Problem of Overfitting (Machine Learning) - PowerPoint PPT Presentation



SLIDE 1

Regularization

The problem of overfitting

Machine Learning

SLIDE 2

Andrew Ng

Example: Linear regression (housing prices). Overfitting: if we have too many features, the learned hypothesis may fit the training set very well (cost J(θ) ≈ 0), but fail to generalize to new examples (e.g., predict prices on new examples).

[Three plots of Price vs. Size: a straight-line fit that underfits (high bias), a quadratic fit that fits well, and a high-order polynomial fit that overfits (high variance)]

SLIDE 3

Example: Logistic regression

h_θ(x) = g(θᵀx)   (g = sigmoid function)

[Three decision-boundary plots in the x1, x2 plane: an underfit linear boundary, a well-fitting boundary, and an overfit, highly contorted boundary]

SLIDE 4

Addressing overfitting:

[Plot of Price vs. Size of house with a high-order polynomial fit]

Features:
  • size of house
  • no. of bedrooms
  • no. of floors
  • age of house
  • average income in neighborhood
  • kitchen size

With many features and relatively little training data, overfitting can become a problem.

SLIDE 5

Addressing overfitting:

Options:

1. Reduce number of features.
   ― Manually select which features to keep.
   ― Model selection algorithm (later in course).
2. Regularization.
   ― Keep all the features, but reduce the magnitude/values of the parameters θj.
   ― Works well when we have a lot of features, each of which contributes a bit to predicting y.

SLIDE 6

SLIDE 7

Regularization

Cost function

Machine Learning

SLIDE 8

Intuition: Suppose we penalize and make θ3 and θ4 really small, e.g.

min_θ (1/2m) Σ_{i=1}^m (h_θ(x^(i)) − y^(i))² + 1000·θ3² + 1000·θ4²

[Two plots of Price vs. Size of house: a wiggly quartic fit θ0 + θ1x + θ2x² + θ3x³ + θ4x⁴, and the nearly quadratic curve obtained once θ3 ≈ 0 and θ4 ≈ 0]

SLIDE 9

Small values for parameters θ0, θ1, …, θn:
― "Simpler" hypothesis
― Less prone to overfitting

Housing example:
― Features: x1, x2, …, x100
― Parameters: θ0, θ1, …, θ100

Regularized cost function:

J(θ) = (1/2m) [ Σ_{i=1}^m (h_θ(x^(i)) − y^(i))² + λ Σ_{j=1}^n θ_j² ]

(By convention, θ0 is not penalized.)
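The regularized cost above can be sketched in code. The course uses Octave; this is an illustrative pure-Python version (function name and the toy data below are my own, not from the lecture), with `theta[0]` excluded from the penalty as on the slide:

```python
def regularized_cost(theta, X, y, lam):
    """Regularized linear regression cost:
    J = (1/2m) [ sum((h - y)^2) + lam * sum(theta_j^2 for j >= 1) ].
    Each row of X includes the bias feature x0 = 1; theta[0] is not penalized."""
    m = len(y)
    sq_err = 0.0
    for xi, yi in zip(X, y):
        h = sum(t * x for t, x in zip(theta, xi))  # h_theta(x) = theta^T x
        sq_err += (h - yi) ** 2
    reg = lam * sum(t ** 2 for t in theta[1:])     # skip theta_0
    return (sq_err + reg) / (2 * m)
```

With λ = 0 this reduces to the ordinary squared-error cost; increasing λ adds a penalty that grows with the size of θ1, …, θn.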

SLIDE 10

Regularization.

min_θ (1/2m) [ Σ_{i=1}^m (h_θ(x^(i)) − y^(i))² + λ Σ_{j=1}^n θ_j² ]

λ is the regularization parameter: it controls the trade-off between fitting the training data well and keeping the parameters small.

[Plot of Price vs. Size of house: the overfit high-order polynomial is smoothed into a simpler curve by regularization]

SLIDE 11

In regularized linear regression, we choose θ to minimize

J(θ) = (1/2m) [ Σ_{i=1}^m (h_θ(x^(i)) − y^(i))² + λ Σ_{j=1}^n θ_j² ]

What if λ is set to an extremely large value (perhaps too large for our problem, say λ = 10^10)?

  • Algorithm works fine; setting λ to be very large can't hurt it.
  • Algorithm fails to eliminate overfitting.
  • Algorithm results in underfitting. (Fails to fit even the training data well.)
  • Gradient descent will fail to converge.
SLIDE 12

In regularized linear regression, we choose θ to minimize J(θ). If λ is set to an extremely large value (say λ = 10^10), the penalty drives θ1, …, θn ≈ 0, so h_θ(x) ≈ θ0: a flat line that underfits the data.

[Plot of Price vs. Size of house showing the horizontal line h_θ(x) = θ0]

SLIDE 13

SLIDE 14

Regularization

Regularized linear regression

Machine Learning

SLIDE 15

Regularized linear regression

min_θ J(θ), where J(θ) = (1/2m) [ Σ_{i=1}^m (h_θ(x^(i)) − y^(i))² + λ Σ_{j=1}^n θ_j² ]

SLIDE 16

Gradient descent

Repeat {
  θ0 := θ0 − α (1/m) Σ_{i=1}^m (h_θ(x^(i)) − y^(i)) x0^(i)
  θj := θj − α [ (1/m) Σ_{i=1}^m (h_θ(x^(i)) − y^(i)) xj^(i) + (λ/m) θj ]    (j = 1, 2, …, n)
}

Equivalently, θj := θj(1 − α λ/m) − α (1/m) Σ_{i=1}^m (h_θ(x^(i)) − y^(i)) xj^(i): the factor (1 − α λ/m), slightly less than 1, shrinks θj a little on every iteration.
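One iteration of this update can be sketched in pure Python (an illustrative stand-in for the course's Octave; the function name and data are mine). Note that θ0 gets the plain gradient while every other θj also receives the (λ/m)θj shrinkage term:

```python
def gradient_step(theta, X, y, alpha, lam):
    """One step of gradient descent for regularized linear regression.
    Rows of X include the bias feature x0 = 1; theta[0] is not regularized."""
    m = len(y)
    # errors[i] = h_theta(x^(i)) - y^(i)
    errors = [sum(t * x for t, x in zip(theta, xi)) - yi for xi, yi in zip(X, y)]
    new_theta = []
    for j, tj in enumerate(theta):
        grad = sum(e * xi[j] for e, xi in zip(errors, X)) / m
        if j > 0:
            grad += (lam / m) * tj   # regularization term, skipped for theta_0
        new_theta.append(tj - alpha * grad)
    return new_theta
```

Repeating this step on data generated by y = x (with λ = 0) drives θ toward [0, 1], the exact least-squares fit.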

SLIDE 17

Normal equation

θ = (XᵀX + λM)⁻¹ Xᵀ y

where M is the (n+1)×(n+1) diagonal matrix diag(0, 1, 1, …, 1): the 0 in the top-left entry leaves θ0 unpenalized.
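A minimal pure-Python sketch of this closed-form solution (my own illustration, not course code): build A = XᵀX + λM and b = Xᵀy, then solve Aθ = b with Gaussian elimination rather than forming an explicit inverse:

```python
def normal_equation(X, y, lam):
    """Solve (X^T X + lam*M) theta = X^T y by Gaussian elimination,
    where M is the identity with its top-left entry zeroed (theta_0 unpenalized)."""
    n = len(X[0])
    A = [[sum(xi[r] * xi[c] for xi in X) for c in range(n)] for r in range(n)]
    for j in range(1, n):                 # skip j = 0: theta_0 is not regularized
        A[j][j] += lam
    b = [sum(xi[r] * yi for xi, yi in zip(X, y)) for r in range(n)]
    # Forward elimination with partial pivoting
    for col in range(n):
        piv = max(range(col, n), key=lambda r: abs(A[r][col]))
        A[col], A[piv] = A[piv], A[col]
        b[col], b[piv] = b[piv], b[col]
        for r in range(col + 1, n):
            f = A[r][col] / A[col][col]
            for c in range(col, n):
                A[r][c] -= f * A[col][c]
            b[r] -= f * b[col]
    # Back substitution
    theta = [0.0] * n
    for r in range(n - 1, -1, -1):
        theta[r] = (b[r] - sum(A[r][c] * theta[c] for c in range(r + 1, n))) / A[r][r]
    return theta
```

With λ = 0 on data lying on y = x this recovers θ = [0, 1]; a very large λ shrinks θ1 toward 0, leaving a near-flat fit around the mean of y, matching the underfitting picture above.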

SLIDE 18

Non-invertibility (optional/advanced).

Suppose m ≤ n   (m = #examples, n = #features). Then XᵀX is singular/non-invertible, so θ = (XᵀX)⁻¹Xᵀy cannot be computed directly.

If λ > 0, the matrix XᵀX + λM is invertible, so regularization also takes care of the non-invertibility problem.

SLIDE 19

SLIDE 20

Regularization

Regularized logistic regression

Machine Learning

SLIDE 21

Regularized logistic regression.

[Decision-boundary plot in the x1, x2 plane: a contorted overfit boundary smoothed by regularization]

Cost function:

J(θ) = −(1/m) Σ_{i=1}^m [ y^(i) log h_θ(x^(i)) + (1 − y^(i)) log(1 − h_θ(x^(i))) ] + (λ/2m) Σ_{j=1}^n θ_j²
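This cost can likewise be sketched in pure Python (an illustrative version, not the course's Octave; names are mine). The only changes from linear regression are the sigmoid hypothesis and the log-loss:

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def logistic_cost(theta, X, y, lam):
    """Regularized logistic regression cost:
    J = -(1/m) sum[y log h + (1-y) log(1-h)] + (lam/2m) sum(theta_j^2, j >= 1)."""
    m = len(y)
    total = 0.0
    for xi, yi in zip(X, y):
        h = sigmoid(sum(t * x for t, x in zip(theta, xi)))
        total += -yi * math.log(h) - (1 - yi) * math.log(1 - h)
    reg = lam / (2 * m) * sum(t ** 2 for t in theta[1:])   # theta_0 not penalized
    return total / m + reg
```

As a sanity check, θ = 0 gives h = 0.5 for every example, so the cost is exactly log 2 regardless of the labels.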

SLIDE 22

Gradient descent

Repeat {
  θ0 := θ0 − α (1/m) Σ_{i=1}^m (h_θ(x^(i)) − y^(i)) x0^(i)
  θj := θj − α [ (1/m) Σ_{i=1}^m (h_θ(x^(i)) − y^(i)) xj^(i) + (λ/m) θj ]    (j = 1, 2, …, n)
}

The update looks identical to regularized linear regression, but here h_θ(x) = 1/(1 + e^(−θᵀx)).
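A pure-Python sketch of one such step (again my own illustration): the update loop is the same as in the linear case, only the error term now passes θᵀx through the sigmoid:

```python
import math

def logistic_gradient_step(theta, X, y, alpha, lam):
    """One regularized gradient-descent step for logistic regression.
    Identical in form to the linear-regression step, except
    h_theta(x) = sigmoid(theta^T x). theta[0] is not regularized."""
    m = len(y)
    errors = [1.0 / (1.0 + math.exp(-sum(t * x for t, x in zip(theta, xi)))) - yi
              for xi, yi in zip(X, y)]
    new_theta = []
    for j, tj in enumerate(theta):
        grad = sum(e * xi[j] for e, xi in zip(errors, X)) / m
        if j > 0:
            grad += (lam / m) * tj
        new_theta.append(tj - alpha * grad)
    return new_theta
```

On a tiny symmetric dataset (x = −1 labeled 0, x = +1 labeled 1), repeated steps push θ1 up so that the predicted probability for the positive example approaches 1.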

SLIDE 23

Advanced optimization

function [jVal, gradient] = costFunction(theta)
  jVal = [code to compute J(theta)];
  gradient(1) = [code to compute ∂J(θ)/∂θ0];
  gradient(2) = [code to compute ∂J(θ)/∂θ1];
  gradient(3) = [code to compute ∂J(θ)/∂θ2];
  ...
  gradient(n+1) = [code to compute ∂J(θ)/∂θn];
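The pattern above (a cost function returning both the objective value and its gradient, handed to an optimizer such as Octave's fminunc) can be mimicked in Python. This is a rough sketch: the quadratic toy objective and the simple gradient-descent driver standing in for the optimizer are my own assumptions, not course material:

```python
def cost_function(theta):
    """Python analogue of the Octave costFunction: returns (jVal, gradient).
    Toy objective (an assumption for illustration): J = (t0 - 5)^2 + (t1 - 5)^2,
    minimized at theta = [5, 5]."""
    j_val = (theta[0] - 5) ** 2 + (theta[1] - 5) ** 2
    gradient = [2 * (theta[0] - 5), 2 * (theta[1] - 5)]
    return j_val, gradient

def minimize(cost_fn, theta, alpha=0.1, iters=500):
    """Minimal gradient-descent driver standing in for an optimizer like fminunc:
    it only needs the (value, gradient) pair that cost_fn returns."""
    for _ in range(iters):
        _, grad = cost_fn(theta)
        theta = [t - alpha * g for t, g in zip(theta, grad)]
    return theta
```

The point of the interface is that a library optimizer never needs to know the form of J(θ); supplying the analytic gradient alongside the value is what lets methods like conjugate gradient or L-BFGS run without numerical differentiation.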

SLIDE 24