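SLIDE 1 Review of Lecture 15
  • Kernel methods

$$K(\mathbf{x}, \mathbf{x}') = \mathbf{z}^{\mathsf{T}}\mathbf{z}' \quad \text{for some } \mathcal{Z} \text{ space}$$

$$K(\mathbf{x}, \mathbf{x}') = \exp\left(-\gamma \|\mathbf{x} - \mathbf{x}'\|^2\right)$$

  • Soft-margin SVM

Minimize

$$\frac{1}{2}\mathbf{w}^{\mathsf{T}}\mathbf{w} + C \sum_{n=1}^{N} \xi_n$$

[Figure: margin violation]

Same as hard margin, but $0 \le \alpha_n \le C$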
SLIDE 2 Learning From Data
Yaser S. Abu-Mostafa
California Institute of Technology
Lecture 16: Radial Basis Functions
Sponsored by Caltech's Provost Office, E&AS Division, and IST
  • Thursday, May 24, 2012
SLIDE 3 Outline
  • RBF and nearest neighbors
  • RBF and neural networks
  • RBF and kernel methods
  • RBF and regularization

AML Creator: Yaser Abu-Mostafa - LFD Lecture 16 2/20
SLIDE 4 Basic RBF model
Each $(\mathbf{x}_n, y_n) \in \mathcal{D}$ influences $h(\mathbf{x})$ based on $\underbrace{\|\mathbf{x} - \mathbf{x}_n\|}_{\text{radial}}$

Standard form:

$$h(\mathbf{x}) = \sum_{n=1}^{N} w_n \underbrace{\exp\left(-\gamma \|\mathbf{x} - \mathbf{x}_n\|^2\right)}_{\text{basis function}}$$
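As a minimal sketch of the standard form above, the following evaluates $h(\mathbf{x})$ for a single input. The function name `rbf_predict` and the sample points, weights, and $\gamma$ are illustrative assumptions, not values from the lecture.

```python
import numpy as np

def rbf_predict(x, points, weights, gamma):
    """Evaluate h(x) = sum_n w_n * exp(-gamma * ||x - x_n||^2) for one input x."""
    x = np.asarray(x, dtype=float)
    points = np.asarray(points, dtype=float)
    # Squared Euclidean distance from x to every data point x_n.
    sq_dist = np.sum((points - x) ** 2, axis=1)
    return float(np.dot(weights, np.exp(-gamma * sq_dist)))

# Illustrative values only (hypothetical data, not from the slides).
X = np.array([[0.0, 0.0], [1.0, 0.0]])
w = np.array([2.0, -1.0])
h_at_first_point = rbf_predict([0.0, 0.0], X, w, gamma=1.0)
```

At the first data point the first Gaussian contributes its full weight and the second is damped by $e^{-\gamma}$, which is the "influence decays with distance" idea of the slide.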
SLIDE 5 The learning algorithm
Finding $w_1, \cdots, w_N$:

$$h(\mathbf{x}) = \sum_{n=1}^{N} w_n \exp\left(-\gamma \|\mathbf{x} - \mathbf{x}_n\|^2\right)$$

based on $\mathcal{D} = (\mathbf{x}_1, y_1), \cdots, (\mathbf{x}_N, y_N)$

$E_{\text{in}} = 0$: $h(\mathbf{x}_n) = y_n$ for $n = 1, \cdots, N$:

$$\sum_{m=1}^{N} w_m \exp\left(-\gamma \|\mathbf{x}_n - \mathbf{x}_m\|^2\right) = y_n$$
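SLIDE 6 The solution

$$\sum_{m=1}^{N} w_m \exp\left(-\gamma \|\mathbf{x}_n - \mathbf{x}_m\|^2\right) = y_n \qquad N \text{ equations in } N \text{ unknowns}$$

$$\underbrace{\begin{bmatrix}
\exp(-\gamma \|\mathbf{x}_1 - \mathbf{x}_1\|^2) & \cdots & \exp(-\gamma \|\mathbf{x}_1 - \mathbf{x}_N\|^2) \\
\exp(-\gamma \|\mathbf{x}_2 - \mathbf{x}_1\|^2) & \cdots & \exp(-\gamma \|\mathbf{x}_2 - \mathbf{x}_N\|^2) \\
\vdots & \ddots & \vdots \\
\exp(-\gamma \|\mathbf{x}_N - \mathbf{x}_1\|^2) & \cdots & \exp(-\gamma \|\mathbf{x}_N - \mathbf{x}_N\|^2)
\end{bmatrix}}_{\Phi}
\underbrace{\begin{bmatrix} w_1 \\ w_2 \\ \vdots \\ w_N \end{bmatrix}}_{\mathbf{w}}
=
\underbrace{\begin{bmatrix} y_1 \\ y_2 \\ \vdots \\ y_N \end{bmatrix}}_{\mathbf{y}}$$

If $\Phi$ is invertible, $\mathbf{w} = \Phi^{-1}\mathbf{y}$ ("exact interpolation")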
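The exact-interpolation solution can be sketched in a few lines of NumPy. This is an illustration under assumptions: the helper names `rbf_matrix` and `fit_exact` and the toy data are hypothetical, and `np.linalg.solve` is used in place of forming $\Phi^{-1}$ explicitly, which gives the same $\mathbf{w}$ when $\Phi$ is invertible.

```python
import numpy as np

def rbf_matrix(X, centers, gamma):
    """Return Phi with Phi[n, m] = exp(-gamma * ||X[n] - centers[m]||^2)."""
    diff = X[:, None, :] - centers[None, :, :]
    return np.exp(-gamma * np.sum(diff ** 2, axis=2))

def fit_exact(X, y, gamma):
    """Solve Phi w = y, with one Gaussian centered on each data point."""
    Phi = rbf_matrix(X, X, gamma)
    return np.linalg.solve(Phi, y)

# Hypothetical 1-D data set with N = 4 points.
X = np.array([[0.0], [1.0], [2.0], [3.0]])
y = np.array([1.0, -1.0, 1.0, 1.0])
gamma = 1.0
w = fit_exact(X, y, gamma)

# With N weights and N equations, the fit reproduces every y_n (E_in = 0).
residual = float(np.max(np.abs(rbf_matrix(X, X, gamma) @ w - y)))
```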
SLIDE 7 The effect of γ

$$h(\mathbf{x}) = \sum_{n=1}^{N} w_n \exp\left(-\gamma \|\mathbf{x} - \mathbf{x}_n\|^2\right)$$

[Figure: two panels, "small γ" and "large γ"]
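One way to see the role of $\gamma$ numerically, offered as a sketch rather than a reproduction of the figure: interpolate two hypothetical points that both have $y = 1$ and evaluate $h$ halfway between them. The function name `interpolate_midpoint` and the two $\gamma$ values are assumptions chosen for illustration.

```python
import numpy as np

def interpolate_midpoint(gamma):
    """Exactly interpolate y = 1 at x = 0 and x = 1, then return h(0.5)."""
    X = np.array([[0.0], [1.0]])
    y = np.array([1.0, 1.0])
    # Phi[n, m] = exp(-gamma * (x_n - x_m)^2); X - X.T broadcasts to a 2x2 array.
    Phi = np.exp(-gamma * (X - X.T) ** 2)
    w = np.linalg.solve(Phi, y)
    x_mid = 0.5
    return float(w @ np.exp(-gamma * (x_mid - X[:, 0]) ** 2))

h_small = interpolate_midpoint(0.1)    # wide Gaussians
h_large = interpolate_midpoint(100.0)  # narrow Gaussians
```

With the small $\gamma$ the wide Gaussians overlap and $h(0.5)$ stays close to 1. With the large $\gamma$ each Gaussian has died out before the midpoint, so $h(0.5)$ is nearly 0 even though both training points are fit exactly.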
SLIDE 8 RBF for classification

$$h(\mathbf{x}) = \mathrm{sign}\left(\sum_{n=1}^{N} w_n \exp\left(-\gamma \|\mathbf{x} - \mathbf{x}_n\|^2\right)\right)$$

Learning: ∼ linear regression for classification

$$s = \sum_{n=1}^{N} w_n \exp\left(-\gamma \|\mathbf{x} - \mathbf{x}_n\|^2\right)$$

Minimize $(s - y)^2$ on $\mathcal{D}$, with $y = \pm 1$

$$h(\mathbf{x}) = \mathrm{sign}(s)$$
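A hedged sketch of this recipe: fit the signal $s$ to the $\pm 1$ labels by least squares, then take the sign. The helper names, the toy data set, and the convention of mapping $s = 0$ to $+1$ are assumptions made for the example.

```python
import numpy as np

def rbf_matrix(X, centers, gamma):
    """Return Phi with Phi[n, m] = exp(-gamma * ||X[n] - centers[m]||^2)."""
    diff = X[:, None, :] - centers[None, :, :]
    return np.exp(-gamma * np.sum(diff ** 2, axis=2))

def fit_rbf_classifier(X, y, gamma):
    """Minimize the squared error between s = Phi w and the +/-1 labels."""
    w, *_ = np.linalg.lstsq(rbf_matrix(X, X, gamma), y, rcond=None)
    return w

def classify(X_new, X_train, w, gamma):
    """Return sign(s) for each row of X_new (s = 0 is mapped to +1 here)."""
    s = rbf_matrix(X_new, X_train, gamma) @ w
    return np.where(s >= 0, 1.0, -1.0)

# Hypothetical data: the label is the sign of the first coordinate.
X = np.array([[-2.0, 0.0], [-1.0, 1.0], [-1.0, -1.0],
              [1.0, 1.0], [1.0, -1.0], [2.0, 0.0]])
y = np.where(X[:, 0] > 0, 1.0, -1.0)
w = fit_rbf_classifier(X, y, gamma=1.0)
train_predictions = classify(X, X, w, gamma=1.0)
```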
SLIDE 9 Relationship to nearest-neighbor method
Adopt the $y$ value of a nearby point:

Similar effect by a basis function:
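For comparison with the RBF forms on the earlier slides, the nearest-neighbor rule itself can be sketched as follows. The function name and the three sample points are hypothetical.

```python
import numpy as np

def nearest_neighbor_predict(x, X, y):
    """Adopt the y value of the training point closest to x."""
    sq_dist = np.sum((X - np.asarray(x, dtype=float)) ** 2, axis=1)
    return y[int(np.argmin(sq_dist))]

# Hypothetical training set.
X = np.array([[0.0, 0.0], [1.0, 1.0], [3.0, 0.0]])
y = np.array([1.0, -1.0, 1.0])
prediction = nearest_neighbor_predict([0.9, 0.8], X, y)  # closest point is (1, 1)
```

The rule switches abruptly from one point's label to another's, whereas a Gaussian basis function lets each point's influence fade gradually with distance.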
SLIDE 10 RBF with K centers
$N$ parameters $w_1, \cdots, w_N$ based on $N$ data points

Use $K \ll N$ centers: $\boldsymbol{\mu}_1, \cdots, \boldsymbol{\mu}_K$ instead of $\mathbf{x}_1, \cdots, \mathbf{x}_N$

$$h(\mathbf{x}) = \sum_{k=1}^{K} w_k \exp\left(-\gamma \|\mathbf{x} - \boldsymbol{\mu}_k\|^2\right)$$

1. How to choose the centers $\boldsymbol{\mu}_k$
2. How to choose the weights $w_k$
SLIDE 11 Choosing the centers
Minimize the distance between $\mathbf{x}_n$ and the closest center $\boldsymbol{\mu}_k$: K-means clustering

Split $\mathbf{x}_1, \cdots, \mathbf{x}_N$ into clusters $S_1, \cdots, S_K$

Minimize

$$\sum_{k=1}^{K} \sum_{\mathbf{x}_n \in S_k} \|\mathbf{x}_n - \boldsymbol{\mu}_k\|^2$$

Unsupervised learning

NP-hard
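SLIDE 12 An iterative algorithm
Lloyd's algorithm: Iteratively minimize

$$\sum_{k=1}^{K} \sum_{\mathbf{x}_n \in S_k} \|\mathbf{x}_n - \boldsymbol{\mu}_k\|^2 \quad \text{w.r.t. } \boldsymbol{\mu}_k, S_k$$

$$\boldsymbol{\mu}_k \leftarrow \frac{1}{|S_k|} \sum_{\mathbf{x}_n \in S_k} \mathbf{x}_n$$

$$S_k \leftarrow \{\mathbf{x}_n : \|\mathbf{x}_n - \boldsymbol{\mu}_k\| \le \text{all } \|\mathbf{x}_n - \boldsymbol{\mu}_\ell\|\}$$

Convergence $\longrightarrow$ local minimum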
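The two alternating updates can be sketched as below. Several choices here are assumptions rather than part of the slide: the initial centers are passed in by the caller, an empty cluster keeps its previous center, and iteration stops when the centers no longer move (or after `n_iters` rounds). The names `assign_clusters` and `lloyd` are hypothetical.

```python
import numpy as np

def assign_clusters(X, centers):
    """S_k update: index of the closest center for every point."""
    sq_dist = np.sum((X[:, None, :] - centers[None, :, :]) ** 2, axis=2)
    return np.argmin(sq_dist, axis=1)

def lloyd(X, init_centers, n_iters=100):
    """Alternate the S_k and mu_k updates until the centers stop moving."""
    centers = np.array(init_centers, dtype=float)
    for _ in range(n_iters):
        labels = assign_clusters(X, centers)
        new_centers = centers.copy()
        for k in range(len(centers)):
            members = X[labels == k]
            if len(members) > 0:  # assumption: an empty cluster keeps its old center
                new_centers[k] = members.mean(axis=0)  # mu_k update
        if np.allclose(new_centers, centers):
            break
        centers = new_centers
    return centers, assign_clusters(X, centers)

# Hypothetical data: two well-separated blobs of 20 points each. Only inputs are used.
rng = np.random.default_rng(0)
X = np.vstack([rng.normal(0.0, 0.1, size=(20, 2)),
               rng.normal(5.0, 0.1, size=(20, 2))])
centers, labels = lloyd(X, init_centers=X[[0, 20]])
```

Because the result is only a local minimum, a different initialization can give different centers; a common workaround is to run several initializations and keep the one with the smallest objective.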
SLIDE 13 Lloyd's algorithm in action

1. Get the data points
2. Only the inputs!
3. Initialize the centers
4. Iterate
5. These are your $\boldsymbol{\mu}_k$'s
SLIDE 14 Centers versus support vectors

[Figure: two panels, "support vectors" and "RBF centers"]
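SLIDE 15 Choosing the weights

$$\sum_{k=1}^{K} w_k \exp\left(-\gamma \|\mathbf{x}_n - \boldsymbol{\mu}_k\|^2\right) \approx y_n \qquad N \text{ equations in } K < N \text{ unknowns}$$

$$\underbrace{\begin{bmatrix}
\exp(-\gamma \|\mathbf{x}_1 - \boldsymbol{\mu}_1\|^2) & \cdots & \exp(-\gamma \|\mathbf{x}_1 - \boldsymbol{\mu}_K\|^2) \\
\exp(-\gamma \|\mathbf{x}_2 - \boldsymbol{\mu}_1\|^2) & \cdots & \exp(-\gamma \|\mathbf{x}_2 - \boldsymbol{\mu}_K\|^2) \\
\vdots & \ddots & \vdots \\
\exp(-\gamma \|\mathbf{x}_N - \boldsymbol{\mu}_1\|^2) & \cdots & \exp(-\gamma \|\mathbf{x}_N - \boldsymbol{\mu}_K\|^2)
\end{bmatrix}}_{\Phi}
\underbrace{\begin{bmatrix} w_1 \\ w_2 \\ \vdots \\ w_K \end{bmatrix}}_{\mathbf{w}}
\approx
\underbrace{\begin{bmatrix} y_1 \\ y_2 \\ \vdots \\ y_N \end{bmatrix}}_{\mathbf{y}}$$

If $\Phi^{\mathsf{T}}\Phi$ is invertible, $\mathbf{w} = (\Phi^{\mathsf{T}}\Phi)^{-1}\Phi^{\mathsf{T}}\mathbf{y}$ (pseudo-inverse)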
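A minimal sketch of the pseudo-inverse step, assuming the centers are already fixed. It uses `np.linalg.lstsq`, which returns the least-squares solution and coincides with $(\Phi^{\mathsf{T}}\Phi)^{-1}\Phi^{\mathsf{T}}\mathbf{y}$ when $\Phi^{\mathsf{T}}\Phi$ is invertible. The helper names, the three centers, and the sine target are illustrative assumptions.

```python
import numpy as np

def rbf_matrix(X, centers, gamma):
    """Return the N x K matrix Phi[n, k] = exp(-gamma * ||X[n] - centers[k]||^2)."""
    diff = X[:, None, :] - centers[None, :, :]
    return np.exp(-gamma * np.sum(diff ** 2, axis=2))

def fit_weights(X, y, centers, gamma):
    """Least-squares weights for fixed centers (N equations, K unknowns)."""
    Phi = rbf_matrix(X, centers, gamma)
    w, *_ = np.linalg.lstsq(Phi, y, rcond=None)
    return w

# Hypothetical data: N = 30 points on a line, K = 3 hand-picked centers.
X = np.linspace(-2.0, 2.0, 30)[:, None]
y = np.sin(X[:, 0])
centers = np.array([[-1.0], [0.0], [1.0]])
gamma = 1.0
w = fit_weights(X, y, centers, gamma)

# Cross-check against the normal-equations form written on the slide.
Phi = rbf_matrix(X, centers, gamma)
w_normal = np.linalg.solve(Phi.T @ Phi, Phi.T @ y)
```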
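SLIDE 16 RBF network

[Diagram: the input $\mathbf{x}$ feeds $\|\mathbf{x} - \boldsymbol{\mu}_1\|, \cdots, \|\mathbf{x} - \boldsymbol{\mu}_k\|, \cdots, \|\mathbf{x} - \boldsymbol{\mu}_K\|$; each passes through $\phi$; the results are combined with weights $w_1, \cdots, w_k, \cdots, w_K$ and a bias $b$ to give $h(\mathbf{x})$]

The features are $\exp\left(-\gamma \|\mathbf{x} - \boldsymbol{\mu}_k\|^2\right)$

Nonlinear transform depends on $\mathcal{D}$ $\Longrightarrow$ No longer a linear model

A bias term ($b$ or $w_0$) is often added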
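One common way to add the bias term, shown here as a sketch under assumptions, is to prepend a constant feature of ones so that its weight plays the role of $w_0$. The function name, centers, and constant target below are hypothetical and chosen so that the expected answer is easy to check.

```python
import numpy as np

def rbf_features_with_bias(X, centers, gamma):
    """Return [1, exp(-gamma*||x - mu_1||^2), ..., exp(-gamma*||x - mu_K||^2)] per row."""
    diff = X[:, None, :] - centers[None, :, :]
    Phi = np.exp(-gamma * np.sum(diff ** 2, axis=2))
    # The leading column of ones makes the first weight act as the bias w_0.
    return np.hstack([np.ones((len(X), 1)), Phi])

# Hypothetical data with a constant target, so the fit should be pure bias.
X = np.linspace(-2.0, 2.0, 20)[:, None]
centers = np.array([[-1.0], [1.0]])
y = np.full(20, 3.0)
Z = rbf_features_with_bias(X, centers, gamma=1.0)
w, *_ = np.linalg.lstsq(Z, y, rcond=None)  # w[0] is the bias, w[1:] the RBF weights
```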
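SLIDE 17 Compare to neural networks

[Diagram, RBF network: the input $\mathbf{x}$ feeds $\|\mathbf{x} - \boldsymbol{\mu}_1\|, \cdots, \|\mathbf{x} - \boldsymbol{\mu}_k\|, \cdots, \|\mathbf{x} - \boldsymbol{\mu}_K\|$; each passes through $\phi$; the results are combined with weights $w_1, \cdots, w_k, \cdots, w_K$ to give $h(\mathbf{x})$]

[Diagram, neural network: the input $\mathbf{x}$ feeds $\mathbf{w}_1^{\mathsf{T}}\mathbf{x}, \cdots, \mathbf{w}_k^{\mathsf{T}}\mathbf{x}, \cdots, \mathbf{w}_K^{\mathsf{T}}\mathbf{x}$; each passes through $\theta$; the results are combined with weights $w_1, \cdots, w_k, \cdots, w_K$ to give $h(\mathbf{x})$]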
SLIDE 18 Choosing γ
Treating $\gamma$ as a parameter to be learned

$$h(\mathbf{x}) = \sum_{k=1}^{K} w_k \exp\left(-\gamma \|\mathbf{x} - \boldsymbol{\mu}_k\|^2\right)$$

Iterative approach (∼ EM algorithm in mixture of Gaussians):

1. Fix $\gamma$, solve for $w_1, \cdots, w_K$
2. Fix $w_1, \cdots, w_K$, minimize error w.r.t. $\gamma$

We can have a different $\gamma_k$ for each center $\boldsymbol{\mu}_k$
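The two alternating steps can be sketched as follows. The slide does not say how step 2 is carried out; this example assumes a simple search over a hypothetical grid of candidate $\gamma$ values (with the current $\gamma$ always included), standing in for whatever descent method one prefers. All function names and data are illustrative.

```python
import numpy as np

def rbf_matrix(X, centers, gamma):
    """Return the N x K matrix Phi[n, k] = exp(-gamma * ||X[n] - centers[k]||^2)."""
    diff = X[:, None, :] - centers[None, :, :]
    return np.exp(-gamma * np.sum(diff ** 2, axis=2))

def squared_error(X, y, centers, w, gamma):
    return float(np.sum((rbf_matrix(X, centers, gamma) @ w - y) ** 2))

def fit_gamma_and_weights(X, y, centers, gamma0, candidates, n_rounds=5):
    gamma = float(gamma0)
    history = []  # squared error at the end of each round
    for _ in range(n_rounds):
        # Step 1: fix gamma, solve for the weights by least squares.
        w, *_ = np.linalg.lstsq(rbf_matrix(X, centers, gamma), y, rcond=None)
        # Step 2: fix the weights, choose the gamma with the smallest error.
        # Assumption: grid search; keeping the current gamma as an option
        # means this step cannot increase the error.
        options = np.append(candidates, gamma)
        errors = [squared_error(X, y, centers, w, g) for g in options]
        best = int(np.argmin(errors))
        gamma = float(options[best])
        history.append(errors[best])
    return gamma, w, history

# Hypothetical data and centers.
X = np.linspace(-2.0, 2.0, 30)[:, None]
y = np.sin(2.0 * X[:, 0])
centers = np.array([[-1.0], [0.0], [1.0]])
gamma, w, history = fit_gamma_and_weights(
    X, y, centers, gamma0=0.5, candidates=np.linspace(0.1, 5.0, 50))
```

Each step minimizes the same error over one group of parameters while the other is held fixed, so the recorded error does not go up from round to round.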
SLIDE 19 Outline
  • RBF and nearest neighbors
  • RBF and neural networks
  • RBF and kernel methods
  • RBF and regularization
SLIDE 20 RBF versus its SVM kernel

[Figure, labeled "RBF" and "SVM"]

SVM kernel implements:

$$\mathrm{sign}\left(\sum_{\alpha_n > 0} \alpha_n y_n \exp\left(-\gamma \|\mathbf{x} - \mathbf{x}_n\|^2\right) + b\right)$$

Straight RBF implements:

$$\mathrm{sign}\left(\sum_{k=1}^{K} w_k \exp\left(-\gamma \|\mathbf{x} - \boldsymbol{\mu}_k\|^2\right) + b\right)$$
SLIDE 21 RBF and regularization
RBF can be derived based purely on regularization:

$$\sum_{n=1}^{N} \left(h(x_n) - y_n\right)^2 + \lambda \sum_{k=0}^{\infty} a_k \int_{-\infty}^{\infty} \left(\frac{d^k h}{dx^k}\right)^2 dx$$

"smoothest interpolation"