
slide-1
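SLIDE 1 Review of Lecture 13
  • Validation

$\mathcal{D}$ ($N$ points) is split into $\mathcal{D}_{\text{train}}$ ($N-K$ points), which produces $g^-$, and $\mathcal{D}_{\text{val}}$ ($K$ points), which produces $E_{\text{val}}(g^-)$.

$E_{\text{val}}(g^-)$ estimates $E_{\text{out}}(g)$

  • Data contamination

[Figure: expected error (0.5 to 0.8) versus validation set size $K$ (5 to 25), with curves $E_{\text{val}}(g^-_{m^*})$ and $E_{\text{out}}(g^-_{m^*})$]

$\mathcal{D}_{\text{val}}$ slightly contaminated

  • Cross validation

$$\underbrace{\mathcal{D}_1\;\mathcal{D}_2\;\mathcal{D}_3\;\mathcal{D}_4\;\mathcal{D}_5\;\mathcal{D}_6\;\mathcal{D}_7\;\mathcal{D}_8\;\mathcal{D}_9\;\mathcal{D}_{10}}_{\mathcal{D}}$$

One part is used to validate; the remaining parts are used to train.

10-fold cross validation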
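The 10-fold scheme can be sketched in a few lines of Python. This is a minimal illustration, not code from the lecture: the function names and the `fit` / `val_error` callbacks are hypothetical placeholders for a real learning algorithm, and the data are assumed to be shuffled already.

```python
def k_fold_splits(n_points, n_folds=10):
    """Return a list of (train_indices, val_indices) pairs, one per fold."""
    fold_size, extra = divmod(n_points, n_folds)
    splits, start = [], 0
    for fold in range(n_folds):
        # The first `extra` folds get one more point when n_points
        # is not a multiple of n_folds.
        stop = start + fold_size + (1 if fold < extra else 0)
        val = list(range(start, stop))
        train = list(range(0, start)) + list(range(stop, n_points))
        splits.append((train, val))
        start = stop
    return splits


def cross_validation_error(n_points, fit, val_error, n_folds=10):
    """Average the validation error over all folds.

    fit(train_indices) -> hypothesis g^-        (hypothetical callback)
    val_error(g, val_indices) -> float          (hypothetical callback)
    """
    errors = []
    for train, val in k_fold_splits(n_points, n_folds):
        g_minus = fit(train)                 # train on N - K points
        errors.append(val_error(g_minus, val))  # validate on K points
    return sum(errors) / len(errors)
```

Every point is used for validation exactly once and for training in all the other folds.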
slide-2
SLIDE 2 Learning From Data
Yaser S. Abu-Mostafa
California Institute of Technology
Lecture 14: Support Vector Machines
Sponsored by Caltech's Provost Office, E&AS Division, and IST
  • Thursday, May 17, 2012
slide-3
SLIDE 3 Outline
  • Maximizing the margin
  • The solution
  • Nonlinear transforms

AML Creator: Yaser Abu-Mostafa - LFD Lecture 14 2/20
slide-4
SLIDE 4 Better linear separation
Linearly separable data

[Figure: the same data with different separating lines]

Different separating lines

Which is best?

Two questions:
1. Why is bigger margin better?
2. Which $\mathbf{w}$ maximizes the margin?
slide-5
SLIDE 5 Remember the growth function?
All dichotomies with any line:
slide-6
SLIDE 6 Dichotomies with fat margin
Fat margins imply fewer dichotomies

[Figure: dichotomies labeled 0.397, 0.5, 0.866, and infinity]
slide-7
SLIDE 7 Finding w with large margin
Let $\mathbf{x}_n$ be the nearest data point to the plane $\mathbf{w}^{\mathsf T}\mathbf{x} = 0$. How far is it?

2 preliminary technicalities:

1. Normalize $\mathbf{w}$:

$$|\mathbf{w}^{\mathsf T}\mathbf{x}_n| = 1$$

2. Pull out $w_0$:

$$\mathbf{w} = (w_1, \cdots, w_d) \quad \text{apart from } b$$

The plane is now $\mathbf{w}^{\mathsf T}\mathbf{x} + b = 0$ (no $x_0$)
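Why the normalization costs nothing can be written out as a short worked step. This is a sketch in the slide's notation; the scaling constant $c$ is introduced here for illustration, and $\mathbf{x}_n$ is assumed not to lie on the plane.

```latex
% Scaling w by any c > 0 leaves the plane itself unchanged:
\{\mathbf{x} : \mathbf{w}^{\mathsf T}\mathbf{x} = 0\}
  = \{\mathbf{x} : (c\,\mathbf{w})^{\mathsf T}\mathbf{x} = 0\},
  \qquad c > 0 .

% So among all scalings of w that describe the same plane, pick
c = \frac{1}{|\mathbf{w}^{\mathsf T}\mathbf{x}_n|}
\quad\Longrightarrow\quad
|(c\,\mathbf{w})^{\mathsf T}\mathbf{x}_n| = 1 .
```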
slide-8
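SLIDE 8 Computing the distance
The distance between $\mathbf{x}_n$ and the plane $\mathbf{w}^{\mathsf T}\mathbf{x} + b = 0$, where $|\mathbf{w}^{\mathsf T}\mathbf{x}_n + b| = 1$

The vector $\mathbf{w}$ is $\perp$ to the plane in the $\mathcal{X}$ space:

[Figure: the plane with the points $\mathbf{x}'$ and $\mathbf{x}''$ on it, the vector $\mathbf{w}$, and the data point $\mathbf{x}_n$]

Take $\mathbf{x}'$ and $\mathbf{x}''$ on the plane

$$\mathbf{w}^{\mathsf T}\mathbf{x}' + b = 0 \quad\text{and}\quad \mathbf{w}^{\mathsf T}\mathbf{x}'' + b = 0$$

$$\Longrightarrow\quad \mathbf{w}^{\mathsf T}(\mathbf{x}' - \mathbf{x}'') = 0$$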
slide-9
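SLIDE 9 and the distance is . . .
Distance between $\mathbf{x}_n$ and the plane:

[Figure: the plane with a point $\mathbf{x}$ on it, the vector $\mathbf{w}$, and the data point $\mathbf{x}_n$]

Take any point $\mathbf{x}$ on the plane

Projection of $\mathbf{x}_n - \mathbf{x}$ on $\mathbf{w}$:

$$\hat{\mathbf{w}} = \frac{\mathbf{w}}{\|\mathbf{w}\|} \quad\Longrightarrow\quad \text{distance} = \left|\hat{\mathbf{w}}^{\mathsf T}(\mathbf{x}_n - \mathbf{x})\right|$$

$$\text{distance} = \frac{1}{\|\mathbf{w}\|}\left|\mathbf{w}^{\mathsf T}\mathbf{x}_n - \mathbf{w}^{\mathsf T}\mathbf{x}\right| = \frac{1}{\|\mathbf{w}\|}\left|\mathbf{w}^{\mathsf T}\mathbf{x}_n + b - \mathbf{w}^{\mathsf T}\mathbf{x} - b\right| = \frac{1}{\|\mathbf{w}\|}$$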
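A small numerical check of this result, as a sketch only: the helper names and the three data points are made up for illustration, and no point is assumed to lie exactly on the plane. After rescaling $(\mathbf{w}, b)$ so that the nearest point gives $|\mathbf{w}^{\mathsf T}\mathbf{x}_n + b| = 1$, its distance should equal $1/\|\mathbf{w}\|$.

```python
import math


def dot(a, b):
    """Inner product of two equal-length sequences."""
    return sum(ak * bk for ak, bk in zip(a, b))


def normalize_plane(w, b, X):
    """Rescale (w, b) so the nearest point satisfies |w.x_n + b| = 1."""
    scale = min(abs(dot(w, x) + b) for x in X)  # assumes scale > 0
    return [wk / scale for wk in w], b / scale


def distance_to_plane(w, b, x):
    """Euclidean distance from x to the plane w.x + b = 0."""
    return abs(dot(w, x) + b) / math.sqrt(dot(w, w))


X = [(3.0, 4.0), (5.0, 5.0), (1.0, 3.0)]        # made-up data points
w, b = normalize_plane([3.0, 4.0], -5.0, X)      # same plane, new scale
nearest = min(distance_to_plane(w, b, x) for x in X)
margin = 1.0 / math.sqrt(dot(w, w))              # the slide's 1 / ||w||
```

Here `nearest` and `margin` agree up to floating-point rounding, since rescaling changes $\mathbf{w}$ and $b$ but not the plane or the distances to it.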
slide-10
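SLIDE 10 The optimization problem
Maximize $\dfrac{1}{\|\mathbf{w}\|}$

subject to $\displaystyle\min_{n=1,2,\ldots,N} |\mathbf{w}^{\mathsf T}\mathbf{x}_n + b| = 1$

Notice: $|\mathbf{w}^{\mathsf T}\mathbf{x}_n + b| = y_n(\mathbf{w}^{\mathsf T}\mathbf{x}_n + b)$

Minimize $\dfrac{1}{2}\mathbf{w}^{\mathsf T}\mathbf{w}$

subject to $y_n(\mathbf{w}^{\mathsf T}\mathbf{x}_n + b) \ge 1$ for $n = 1, 2, \ldots, N$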
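The step from the first problem to the second can be spelled out as follows. This is a sketch of the standard argument rather than text from the slide; the constant $c$ is introduced here for illustration, and the plane is assumed to separate the data.

```latex
% 1. Same objective: maximizing 1/||w|| is minimizing ||w||.
\max_{\mathbf{w},b} \frac{1}{\|\mathbf{w}\|}
  \;\Longleftrightarrow\;
\min_{\mathbf{w},b} \tfrac{1}{2}\,\mathbf{w}^{\mathsf T}\mathbf{w}

% 2. The equality constraint implies the inequality constraints:
\min_{n} y_n(\mathbf{w}^{\mathsf T}\mathbf{x}_n + b) = 1
  \;\Longrightarrow\;
y_n(\mathbf{w}^{\mathsf T}\mathbf{x}_n + b) \ge 1 \quad \text{for all } n .

% 3. Nothing is lost by relaxing: if the minimum were c > 1, then
%    (w/c, b/c) would still satisfy every inequality and have a
%    strictly smaller objective, so c > 1 cannot happen at the optimum.
\min_{n} y_n(\mathbf{w}^{\mathsf T}\mathbf{x}_n + b) = c > 1
  \;\Longrightarrow\;
\tfrac{1}{2}\Big(\tfrac{\mathbf{w}}{c}\Big)^{\mathsf T}\Big(\tfrac{\mathbf{w}}{c}\Big)
  = \frac{1}{c^{2}}\cdot\tfrac{1}{2}\,\mathbf{w}^{\mathsf T}\mathbf{w}
  < \tfrac{1}{2}\,\mathbf{w}^{\mathsf T}\mathbf{w} .
```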
slide-11
SLIDE 11 Outline
  • Maximizing the margin
  • The solution
  • Nonlinear transforms
slide-12
SLIDE 12 Constrained optimization
Minimize $\dfrac{1}{2}\mathbf{w}^{\mathsf T}\mathbf{w}$

subject to $y_n(\mathbf{w}^{\mathsf T}\mathbf{x}_n + b) \ge 1$ for $n = 1, 2, \ldots, N$

$\mathbf{w} \in \mathbb{R}^d,\; b \in \mathbb{R}$

Lagrange? inequality constraints $\Longrightarrow$ KKT
slide-13
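SLIDE 13 We saw this before
[Figure: the regularization picture, showing $\mathbf{w}_{\text{lin}}$, the constraint $\mathbf{w}^{\mathsf T}\mathbf{w} = C$, the solution $\mathbf{w}$, the contour $E_{\text{in}} = \text{const.}$, $\nabla E_{\text{in}}$, and the normal]

Remember regularization?

Minimize $E_{\text{in}}(\mathbf{w}) = \dfrac{1}{N}(Z\mathbf{w} - \mathbf{y})^{\mathsf T}(Z\mathbf{w} - \mathbf{y})$

subject to: $\mathbf{w}^{\mathsf T}\mathbf{w} \le C$

$\nabla E_{\text{in}}$ normal to constraint

                  optimize                             constrain
Regularization:   $E_{\text{in}}$                      $\mathbf{w}^{\mathsf T}\mathbf{w}$
SVM:              $\mathbf{w}^{\mathsf T}\mathbf{w}$   $E_{\text{in}}$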
slide-14
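SLIDE 14 Lagrange formulation
Minimize

$$\mathcal{L}(\mathbf{w}, b, \boldsymbol{\alpha}) = \frac{1}{2}\mathbf{w}^{\mathsf T}\mathbf{w} - \sum_{n=1}^{N} \alpha_n \left( y_n(\mathbf{w}^{\mathsf T}\mathbf{x}_n + b) - 1 \right)$$

w.r.t. $\mathbf{w}$ and $b$, and maximize w.r.t. each $\alpha_n \ge 0$

$$\nabla_{\mathbf{w}} \mathcal{L} = \mathbf{w} - \sum_{n=1}^{N} \alpha_n y_n \mathbf{x}_n = \mathbf{0}$$

$$\frac{\partial \mathcal{L}}{\partial b} = -\sum_{n=1}^{N} \alpha_n y_n = 0$$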
slide-15
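SLIDE 15 Substituting . . .

$$\mathbf{w} = \sum_{n=1}^{N} \alpha_n y_n \mathbf{x}_n \quad\text{and}\quad \sum_{n=1}^{N} \alpha_n y_n = 0$$

in the Lagrangian

$$\mathcal{L}(\mathbf{w}, b, \boldsymbol{\alpha}) = \frac{1}{2}\mathbf{w}^{\mathsf T}\mathbf{w} - \sum_{n=1}^{N} \alpha_n \left( y_n(\mathbf{w}^{\mathsf T}\mathbf{x}_n + b) - 1 \right)$$

we get

$$\mathcal{L}(\boldsymbol{\alpha}) = \sum_{n=1}^{N} \alpha_n - \frac{1}{2}\sum_{n=1}^{N}\sum_{m=1}^{N} y_n y_m \,\alpha_n \alpha_m \,\mathbf{x}_n^{\mathsf T}\mathbf{x}_m$$

Maximize w.r.t. $\boldsymbol{\alpha}$ subject to $\alpha_n \ge 0$ for $n = 1, \cdots, N$ and $\sum_{n=1}^{N} \alpha_n y_n = 0$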
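The intermediate algebra behind "we get" can be written out as below. It is a worked sketch that uses only the two conditions $\mathbf{w} = \sum_n \alpha_n y_n \mathbf{x}_n$ and $\sum_n \alpha_n y_n = 0$ stated on the slide.

```latex
\begin{aligned}
\mathcal{L}
 &= \tfrac{1}{2}\mathbf{w}^{\mathsf T}\mathbf{w}
    - \sum_{n=1}^{N} \alpha_n y_n \mathbf{w}^{\mathsf T}\mathbf{x}_n
    - b \sum_{n=1}^{N} \alpha_n y_n
    + \sum_{n=1}^{N} \alpha_n \\
 &= \tfrac{1}{2}\mathbf{w}^{\mathsf T}\mathbf{w}
    - \mathbf{w}^{\mathsf T}\Big(\sum_{n=1}^{N} \alpha_n y_n \mathbf{x}_n\Big)
    - b \cdot 0
    + \sum_{n=1}^{N} \alpha_n \\
 &= \tfrac{1}{2}\mathbf{w}^{\mathsf T}\mathbf{w}
    - \mathbf{w}^{\mathsf T}\mathbf{w}
    + \sum_{n=1}^{N} \alpha_n \\
 &= \sum_{n=1}^{N} \alpha_n - \tfrac{1}{2}\mathbf{w}^{\mathsf T}\mathbf{w}
  = \sum_{n=1}^{N} \alpha_n
    - \tfrac{1}{2}\sum_{n=1}^{N}\sum_{m=1}^{N}
      y_n y_m\, \alpha_n \alpha_m\, \mathbf{x}_n^{\mathsf T}\mathbf{x}_m .
\end{aligned}
```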
slide-16
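SLIDE 16 The solution - quadratic programming

$$\min_{\boldsymbol{\alpha}} \;\; \frac{1}{2}\,\boldsymbol{\alpha}^{\mathsf T}
\underbrace{\begin{bmatrix}
y_1 y_1\, \mathbf{x}_1^{\mathsf T}\mathbf{x}_1 & y_1 y_2\, \mathbf{x}_1^{\mathsf T}\mathbf{x}_2 & \cdots & y_1 y_N\, \mathbf{x}_1^{\mathsf T}\mathbf{x}_N \\
y_2 y_1\, \mathbf{x}_2^{\mathsf T}\mathbf{x}_1 & y_2 y_2\, \mathbf{x}_2^{\mathsf T}\mathbf{x}_2 & \cdots & y_2 y_N\, \mathbf{x}_2^{\mathsf T}\mathbf{x}_N \\
\vdots & \vdots & \ddots & \vdots \\
y_N y_1\, \mathbf{x}_N^{\mathsf T}\mathbf{x}_1 & y_N y_2\, \mathbf{x}_N^{\mathsf T}\mathbf{x}_2 & \cdots & y_N y_N\, \mathbf{x}_N^{\mathsf T}\mathbf{x}_N
\end{bmatrix}}_{\text{quadratic coefficients}}
\boldsymbol{\alpha} \;+\; \underbrace{(-\mathbf{1}^{\mathsf T})}_{\text{linear}}\,\boldsymbol{\alpha}$$

subject to

$$\underbrace{\mathbf{y}^{\mathsf T}\boldsymbol{\alpha} = 0}_{\text{linear constraint}}$$

$$\underbrace{\mathbf{0}}_{\text{lower bounds}} \;\le\; \boldsymbol{\alpha} \;\le\; \underbrace{\boldsymbol{\infty}}_{\text{upper bounds}}$$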
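To make the pieces of this QP concrete, here is a minimal dependency-free sketch. It builds the quadratic coefficients and then solves the problem with a simple pairwise coordinate update in the spirit of SMO. That solver is swapped in here only to keep the example self-contained; the slide hands the problem to a quadratic programming package. All function names and the four data points are made up for illustration.

```python
def dual_matrix(X, y):
    """Quadratic coefficients Q[n][m] = y_n * y_m * (x_n . x_m)."""
    dot = lambda a, b: sum(ak * bk for ak, bk in zip(a, b))
    N = len(X)
    return [[y[n] * y[m] * dot(X[n], X[m]) for m in range(N)]
            for n in range(N)]


def dual_objective(Q, alpha):
    """The minimized quantity: (1/2) alpha^T Q alpha - 1^T alpha."""
    N = len(alpha)
    quad = sum(alpha[n] * Q[n][m] * alpha[m]
               for n in range(N) for m in range(N))
    return 0.5 * quad - sum(alpha)


def solve_dual_pairwise(Q, y, sweeps=100, tol=1e-12):
    """Toy solver: minimize over one pair (i, j) at a time while
    keeping y^T alpha = 0 and alpha >= 0. Not a general QP package."""
    N = len(y)
    alpha = [0.0] * N
    for _ in range(sweeps):
        largest_step = 0.0
        for i in range(N):
            for j in range(i + 1, N):
                s = y[i] * y[j]
                curvature = Q[i][i] + Q[j][j] - 2 * s * Q[i][j]
                if curvature <= tol:
                    continue  # degenerate pair, nothing to gain
                # Gradient of the minimized objective: g = Q alpha - 1
                g_i = sum(Q[i][m] * alpha[m] for m in range(N)) - 1.0
                g_j = sum(Q[j][m] * alpha[m] for m in range(N)) - 1.0
                # Move alpha_j by d and alpha_i by -s*d (keeps y^T alpha = 0)
                d = -(g_j - s * g_i) / curvature
                if s > 0:   # alpha_i - d >= 0 and alpha_j + d >= 0
                    d = min(max(d, -alpha[j]), alpha[i])
                else:       # alpha_i + d >= 0 and alpha_j + d >= 0
                    d = max(d, -alpha[j], -alpha[i])
                alpha[j] += d
                alpha[i] -= s * d
                largest_step = max(largest_step, abs(d))
        if largest_step < tol:
            break
    return alpha


# Made-up, linearly separable toy data
X = [(0.0, 0.0), (2.0, 2.0), (2.0, 0.0), (3.0, 0.0)]
y = [-1, -1, 1, 1]
Q = dual_matrix(X, y)
alpha = solve_dual_pairwise(Q, y)
```

On this toy data the update settles at $\boldsymbol{\alpha} = (0.5, 0.5, 1, 0)$, which satisfies $\mathbf{y}^{\mathsf T}\boldsymbol{\alpha} = 0$ and $\boldsymbol{\alpha} \ge \mathbf{0}$. For real problems a dedicated QP or SVM solver is the sensible choice.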
slide-17
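SLIDE 17 QP hands us α
Solution: $\boldsymbol{\alpha} = \alpha_1, \cdots, \alpha_N$

$$\Longrightarrow\quad \mathbf{w} = \sum_{n=1}^{N} \alpha_n y_n \mathbf{x}_n$$

KKT condition: For $n = 1, \cdots, N$

$$\alpha_n \left( y_n(\mathbf{w}^{\mathsf T}\mathbf{x}_n + b) - 1 \right) = 0$$

We saw this before!

[Figure: the regularization picture, showing $\mathbf{w}_{\text{lin}}$, the constraint $\mathbf{w}^{\mathsf T}\mathbf{w} = C$, the solution $\mathbf{w}$, the contour $E_{\text{in}} = \text{const.}$, $\nabla E_{\text{in}}$, and the normal]

$\alpha_n > 0 \;\Longrightarrow\; \mathbf{x}_n$ is a support vector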
slide-18
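SLIDE 18 Support vectors
[Figure: the separating plane and the data points closest to it]

Closest $\mathbf{x}_n$'s to the plane: achieve the margin

$$\Longrightarrow\quad y_n(\mathbf{w}^{\mathsf T}\mathbf{x}_n + b) = 1$$

$$\mathbf{w} = \sum_{\mathbf{x}_n \text{ is SV}} \alpha_n y_n \mathbf{x}_n$$

Solve for $b$ using any SV:

$$y_n(\mathbf{w}^{\mathsf T}\mathbf{x}_n + b) = 1$$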
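Going from $\boldsymbol{\alpha}$ back to the plane can be sketched as follows. The data and the vector `alpha` are a made-up toy example (one can check by hand that this $\boldsymbol{\alpha}$ satisfies the KKT conditions for these four points), and `recover_plane` is a hypothetical helper name. Solving for $b$ uses $y_n^2 = 1$, so $y_n(\mathbf{w}^{\mathsf T}\mathbf{x}_n + b) = 1$ gives $b = y_n - \mathbf{w}^{\mathsf T}\mathbf{x}_n$.

```python
def recover_plane(X, y, alpha, tol=1e-9):
    """Return (w, b, support_vector_indices) from a dual solution alpha."""
    d = len(X[0])
    # Support vectors are the points with alpha_n > 0
    sv = [n for n, a in enumerate(alpha) if a > tol]
    # w = sum over support vectors of alpha_n * y_n * x_n
    w = [sum(alpha[n] * y[n] * X[n][k] for n in sv) for k in range(d)]
    # Any support vector satisfies y_n (w.x_n + b) = 1, so b = y_n - w.x_n
    n0 = sv[0]
    b = y[n0] - sum(wk * xk for wk, xk in zip(w, X[n0]))
    return w, b, sv


# Made-up toy data and its dual solution
X = [(0.0, 0.0), (2.0, 2.0), (2.0, 0.0), (3.0, 0.0)]
y = [-1, -1, 1, 1]
alpha = [0.5, 0.5, 1.0, 0.0]

w, b, sv = recover_plane(X, y, alpha)
# y_n (w.x_n + b) for every point: equal to 1 on the support vectors
margins = [y[n] * (sum(wk * xk for wk, xk in zip(w, X[n])) + b)
           for n in range(len(X))]
```

The fourth point has $\alpha_4 = 0$ and a value of $y_n(\mathbf{w}^{\mathsf T}\mathbf{x}_n + b)$ larger than 1, so it plays no role in defining the plane.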
slide-19
SLIDE 19 Outline
  • Maximizing the margin
  • The solution
  • Nonlinear transforms
slide-20
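SLIDE 20 z instead of x

$$\mathcal{L}(\boldsymbol{\alpha}) = \sum_{n=1}^{N} \alpha_n - \frac{1}{2}\sum_{n=1}^{N}\sum_{m=1}^{N} y_n y_m \,\alpha_n \alpha_m \,\mathbf{z}_n^{\mathsf T}\mathbf{z}_m$$

[Figure: two panels, the data in $\mathcal{X}$ space (axis ticks at −1 and 1) and in $\mathcal{Z}$ space (axis ticks at 0.5 and 1)]

$$\mathcal{X} \longrightarrow \mathcal{Z}$$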
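The practical consequence is that only the inner products change. A minimal sketch, assuming a hypothetical transform that squares each coordinate (the slide does not spell out the transform) and made-up data points:

```python
def phi(x):
    """Hypothetical transform from X space to Z space: square each coordinate."""
    return tuple(xk * xk for xk in x)


def inner_products(points):
    """Matrix of all pairwise inner products p_n . p_m."""
    dot = lambda a, b: sum(ak * bk for ak, bk in zip(a, b))
    return [[dot(p, q) for q in points] for p in points]


X = [(1, 0), (0, -1), (2, 2), (-2, 1)]   # made-up points in X space
Z = [phi(x) for x in X]                  # the same points in Z space

G_x = inner_products(X)   # x_n . x_m, used by the linear problem
G_z = inner_products(Z)   # z_n . z_m, used after the transform
```

Both matrices are $N \times N$, so the quadratic program has the same number of variables $\alpha_n$ whatever the dimension of the $\mathcal{Z}$ space.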
slide-21
SLIDE 21 Support vectors in X space
[Figure: the data and decision boundary in $\mathcal{X}$ space]

Support vectors live in $\mathcal{Z}$ space

In $\mathcal{X}$ space, pre-images of support vectors

The margin is maintained in $\mathcal{Z}$ space

Generalization result

$$\mathbb{E}[E_{\text{out}}] \;\le\; \frac{\mathbb{E}[\#\text{ of SV's}]}{N - 1}$$