Learning From Data Lecture 10: Nonlinear Transforms, The Z-space - PowerPoint PPT Presentation



SLIDE 1

Learning From Data Lecture 10 Nonlinear Transforms

The Z-space Polynomial transforms Be careful

M. Magdon-Ismail

CSCI 4100/6100

SLIDE 2

recap: The Linear Model

linear in x: gives the line/hyperplane separator

s = wᵀx

linear in w: makes the algorithms work

Credit Analysis:

Approve or Deny        →  Perceptron           (classification error; PLA, Pocket, . . .)
Amount of Credit       →  Linear Regression    (squared error; pseudo-inverse)
Probability of Default →  Logistic Regression  (cross-entropy error; gradient descent)

© AML Creator: Malik Magdon-Ismail

Nonlinear Transforms: 2/18

Limitations of linear →

SLIDE 3

The Linear Model has its Limits

[Figure: (a) linear with outliers; (b) essentially nonlinear.]

To address (b) we need something more than linear.

Change the features →

SLIDE 4

Change Your Features

[Figure: Income vs. Years in Residence, Y.]

Y ≫ 3 years: no additional effect beyond Y = 3;
Y ≪ 0.3 years: no additional effect below Y = 0.3.

‘Transform’ your features →

SLIDE 5

Change Your Features Using a Transform

[Figure: left panel, Income vs. Years in Residence, Y; right panel, Income vs. a transformed feature z1 of Y, in which the data becomes linearly separable.]

Feature transform I: Z-space →

SLIDE 6

Mechanics of the Feature Transform I

Transform the data to a Z-space in which the data is separable.

[Figure: data in (x1, x2)-space mapped to (z1, z2)-space by z1 = x1², z2 = x2².]

x = (1, x1, x2)ᵀ  →  z = Φ(x) = (1, x1², x2²)ᵀ = (1, Φ1(x), Φ2(x))ᵀ
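As a concrete sketch (not from the lecture, and the function name `phi` is an assumption), the transform z = Φ(x) = (1, x1², x2²) is a one-liner in Python:

```python
import numpy as np

def phi(x):
    """Map x = (x1, x2) to z = (1, x1^2, x2^2) in the Z-space."""
    x1, x2 = x
    return np.array([1.0, x1**2, x2**2])

z = phi([2.0, -3.0])
print(z)  # [1. 4. 9.]
```

Note that the mapping is fixed before learning; only the weights multiplying (1, z1, z2) are learned.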
Feature transform II: classify in Z-space →

SLIDE 7

Mechanics of the Feature Transform II

Separate the data in the Z-space with w̃:

g̃(z) = sign(w̃ᵀz)

Feature transform III: bring back to X-space →

SLIDE 8

Mechanics of the Feature Transform III

To classify a new x, first transform x to Φ(x) ∈ Z-space and classify there with g̃:

g(x) = g̃(Φ(x)) = sign(w̃ᵀΦ(x))
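The three steps above, transform the data, learn w̃ in the Z-space, classify new points through Φ, can be sketched in Python. This is a toy illustration: the data set is made up, and using the pseudo-inverse (linear regression) as the learning algorithm is one choice among those the lecture lists.

```python
import numpy as np

def phi(x):
    # Z-space features: (1, x1^2, x2^2)
    return np.array([1.0, x[0]**2, x[1]**2])

# Toy data: +1 inside the circle x1^2 + x2^2 < 1, -1 outside (illustrative only)
X = np.array([[0.2, 0.1], [-0.3, 0.4], [1.5, 0.2], [-0.1, -1.8]])
y = np.array([1.0, 1.0, -1.0, -1.0])

Z = np.array([phi(x) for x in X])   # step I: transform to Z-space
w_tilde = np.linalg.pinv(Z) @ y     # step II: separate in Z-space (pseudo-inverse)

def g(x):
    # step III: classify a new x by going through the Z-space
    return int(np.sign(w_tilde @ phi(x)))

print(g([0.1, 0.3]), g([2.0, 2.0]))  # a point near the origin vs. a far-away point
```

The boundary is linear in z but quadratic (a circle) back in x, which is the whole point of the transform.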

Summary of nonlinear transform →

SLIDE 9

The General Feature Transform

X-space is R^d; Z-space is R^d̃.

x = (1, x1, . . . , xd)ᵀ  →  z = Φ(x) = (1, Φ1(x), . . . , Φd̃(x))ᵀ = (1, z1, . . . , zd̃)ᵀ

Data:    x1, x2, . . . , xN  →  z1, z2, . . . , zN
Labels:  y1, y2, . . . , yN  →  y1, y2, . . . , yN (unchanged)
Weights: no weights in X-space; w̃ = (w0, w1, . . . , wd̃)ᵀ in Z-space

g(x) = sign(w̃ᵀΦ(x))

Generalization →

SLIDE 10

Generalization

X-space: dvc = d + 1  →  Z-space: d̃vc = d̃ + 1

Choose the feature transform with smallest d̃.

Many possibilities to choose from →

SLIDE 11

Many Nonlinear Features May Work

[Figure: data in (x1, x2)-space with circular boundary x1² + x2² = 0.6.]

Candidate transforms:
  z1 = (x1 + 0.05)², z2 = x2²
  z1 = x1², z2 = x2²
  z1 = x1² + x2² − 0.6

A rat! A rat!

This is called data snooping: looking at your data and tailoring your H.

Choose before looking at data →

SLIDE 13

Must Choose Φ BEFORE You Look at the Data

After constructing features carefully, before seeing the data . . .
. . . if you think linear is not enough, try the 2nd order polynomial transform:

x = (1, x1, x2)ᵀ  →  Φ(x) = (1, Φ1(x), Φ2(x), Φ3(x), Φ4(x), Φ5(x))ᵀ = (1, x1, x2, x1², x1x2, x2²)ᵀ
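For concreteness, a minimal sketch of the 2nd order polynomial transform (the function name `phi2` is an assumption, not from the slides):

```python
import numpy as np

def phi2(x):
    """2nd order polynomial transform: (1, x1, x2, x1^2, x1*x2, x2^2)."""
    x1, x2 = x
    return np.array([1.0, x1, x2, x1**2, x1 * x2, x2**2])

print(phi2([2.0, 3.0]))  # [1. 2. 3. 4. 6. 9.]
```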

The polynomial transform →

SLIDE 14

The General Polynomial Transform Φk

We can get even fancier with the degree-k polynomial transform:

Φ1(x) = (1, x1, x2)
Φ2(x) = (1, x1, x2, x1², x1x2, x2²)
Φ3(x) = (1, x1, x2, x1², x1x2, x2², x1³, x1²x2, x1x2², x2³)
Φ4(x) = (1, x1, x2, x1², x1x2, x2², x1³, x1²x2, x1x2², x2³, x1⁴, x1³x2, x1²x2², x1x2³, x2⁴)
. . .

– Dimensionality of the feature space, and hence dvc, increases rapidly.
– Similar transforms exist for a d-dimensional original space.
– Approximation-generalization tradeoff:

Higher degree gives lower (even zero) Ein, but worse generalization.
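A quick way to see how fast d̃ grows: the number of monomials of degree at most k in d variables, including the constant 1, is the binomial coefficient C(d + k, k) (standard combinatorics, not stated on the slide; the function name `dim_phi` is an assumption). This reproduces the lengths 3, 6, 10, 15 of Φ1 through Φ4 above:

```python
from math import comb

def dim_phi(d, k):
    """Number of monomials of degree <= k in d variables, constant included."""
    return comb(d + k, k)

for k in (1, 2, 3, 4, 10):
    print(k, dim_phi(2, k))  # degree 1 -> 3, 2 -> 6, 3 -> 10, 4 -> 15, 10 -> 66
```

Already at degree 10 in two variables there are 66 features, so the VC dimension (at most d̃ + 1) balloons with k.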

Be careful with nonlinear transforms →

SLIDE 15

Be Careful with Feature Transforms

Insist on Ein = 0 →

SLIDE 16

Be Careful with Feature Transforms

A high order polynomial transform leads to “nonsense”.

Digits data →

SLIDE 17

Digits Data “1” Versus “All”

[Figure: two panels of the digits data, axes Average Intensity vs. Symmetry, showing the linear and 3rd order polynomial decision boundaries.]

Linear model: Ein = 2.13%, Eout = 2.38%
3rd order polynomial model: Ein = 1.75%, Eout = 1.87%

Use the linear model! →

SLIDE 18

Use the Linear Model!

  • First try a linear model – simple, robust, and it works.
  • Algorithms can tolerate error, plus you have nonlinear feature transforms.
  • Choose a feature transform before seeing the data. Stay simple.

Data snooping is hazardous to your Eout.

  • Linear models are fundamental in their own right; they are also the building blocks of many more complex models, like support vector machines.
  • Nonlinear transforms also apply to regression and logistic regression.
