Perceptrons
“From the heights of error, To the valleys of Truth”
Piyush Kumar Advanced Computational Geometry
Reading Material
Duda/Hart/Stork: 5.4/5.5/9.6.8
Any neural network book (Haykin, Anderson)
Look at papers of …
Supervised Learning
[Diagram: input pattern → learner → output pattern; compare with the desired output and correct if necessary.]
Definition
A linear discriminant is a function that is a linear combination of the components of x:
    g(x) = w · x + w0
where w is the weight vector and w0 the bias.
A two-category classifier implements the following rule: decide ω1 if g(x) > 0 and ω2 if g(x) < 0.
The equation g(x) = 0 defines the decision surface.
When g(x) is linear, the decision surface is a hyperplane.
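A minimal sketch of this decision rule (the weight vector and bias here are arbitrary illustration values, not from the slides):

```python
# Decide class 1 (omega_1) if g(x) > 0, class 2 (omega_2) if g(x) < 0.

def g(w, w0, x):
    """Linear discriminant g(x) = w . x + w0."""
    return sum(wi * xi for wi, xi in zip(w, x)) + w0

def decide(w, w0, x):
    # g(x) = 0 is the decision surface; points with g > 0 fall in class 1.
    return 1 if g(w, w0, x) > 0 else 2

w, w0 = [1.0, -2.0], 0.5          # illustration values only
print(decide(w, w0, [3.0, 0.0]))  # g = 3.5 > 0 -> class 1
print(decide(w, w0, [0.0, 2.0]))  # g = -3.5 < 0 -> class 2
```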
Two main approaches
Fisher’s Linear Discriminant
Linear Discrimination in d-dimensions
Proposed by Frank Rosenblatt in 1957. Neural net researchers accuse …
Numerous variants. We’ll cover the one that’s most …
One of the simplest neural networks.
    y = +1 if Σ_{i=0}^{n} wi xi > 0, and y = -1 otherwise

[Diagram: inputs x0 = -1, x1, x2, x3, …, xn enter with weights w0, w1, w2, w3, …, wn; their weighted sum is thresholded, then compared and corrected.]
Two Category Linearly Separable Case
Class 1: (+1)
Class 2: (-1)
Let’s assume for this talk that the red …
Is this unique?
Aka learning half spaces. Can be solved in polynomial time using linear programming.
Can also be solved using a simple and …
N samples: (x1, y1), (x2, y2), …, (xN, yN), with xj ∈ R^d,
where yj = +/-1 are labels for the data.
Can we find a hyperplane that separates the two classes (labeled by y)? I.e.:
    w · xj > 0 for all j such that yj = +1
    w · xj < 0 for all j such that yj = -1
Which we will relax later!
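As a quick illustration (my own example data, not from the slides), the two conditions can be checked with a single sign test, since multiplying w · xj by the label yj flips the inequality for class -1:

```python
# Check whether a candidate w satisfies both separation conditions:
# w . x_j > 0 when y_j = +1 and w . x_j < 0 when y_j = -1,
# which is equivalent to y_j * (w . x_j) > 0 for every sample.

def separates(w, samples):
    return all(y * sum(wi * xi for wi, xi in zip(w, x)) > 0
               for x, y in samples)

samples = [([2.0, 1.0], +1), ([1.0, 2.0], +1), ([-1.0, -2.0], -1)]
print(separates([1.0, 1.0], samples))   # True: this w separates the labels
print(separates([1.0, -1.0], samples))  # False: (1,2) lands on the wrong side
```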
Relax now!! ☺
Let’s assume that we are looking for a hyperplane through the origin.
“Homogenize” the coordinates by adding an extra coordinate to every point.
Think of it as moving the whole red and …
From 2D to 3D it is just the x-y plane …
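A sketch of the homogenization step (my illustration; the constant coordinate x0 = -1 follows the convention in the perceptron diagram, while the MATLAB code later uses +1 with the sign absorbed into the weights):

```python
# Prepend a constant coordinate so the bias folds into the weight vector
# and the sought hyperplane passes through the origin in the lifted space.

def homogenize(x):
    return [-1.0] + list(x)  # x0 = -1, as in the perceptron diagram

def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

w, w0 = [1.0, 2.0], 0.5
x = [3.0, -1.0]
# g(x) = w . x - w0 in the original space ...
g = dot(w, x) - w0
# ... equals w' . x' with w' = [w0] + w in the homogenized space.
print(g == dot([w0] + w, homogenize(x)))  # True
```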
Relax now! ☺
Assume all points lie on a unit sphere! If they are not after applying …, normalize them: scaling a point by a positive constant does not change which side of a hyperplane through the origin it lies on.
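A quick numeric check (my own example, assuming a hyperplane through the origin) that projecting points onto the unit sphere is harmless:

```python
import math

# Scaling x by the positive constant 1/||x|| cannot flip the sign of w . x,
# so the side-of-hyperplane test is unchanged by normalization.

def normalize(x):
    n = math.sqrt(sum(xi * xi for xi in x))
    return [xi / n for xi in x]

def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

w = [1.0, -2.0, 0.5]
x = [3.0, 4.0, 12.0]
print((dot(w, x) > 0) == (dot(w, normalize(x)) > 0))  # True: same side
```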
Given: A set of points on a sphere in d dimensions, …
Output: Find one such halfspace.
Note: You can solve the LP feasibility problem.
Take Estie’s class if you want to know why. ☺
Margin
Given a convex body (in V-form), find a …
From learning theory, maximum margin is good.
Unlike Perceptrons, SVMs have a unique solution, but they are harder to solve. <QP>
There are very simple algorithms to …
So how do we solve the LP?
Simplex
Ellipsoid
IP methods
Perceptrons = Gradient Descent
You can write an LP solver in 5 mins! A very slight modification can give you a …
Multiple perceptrons clubbed together are …
Perceptrons have a finite capacity, and so …
From learning theory, limited capacity is good.
If the data is separable with, say, a …
Delaunay!??
Lift the points to a paraboloid in one higher dimension. For instance, if the data is in 2D: (x, y) → (x, y, x^2 + y^2).
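A small illustration (my own example points) of why the lift works: a point inside a circle of radius r lands below the plane z = r^2, and a point outside lands above it, so classes separated by a circle become linearly separable after lifting:

```python
# The paraboloid lift from the slide: (x, y) -> (x, y, x^2 + y^2).

def lift(p):
    x, y = p
    return (x, y, x * x + y * y)

inside = [(0.1, 0.2), (-0.3, 0.1)]   # inside the unit circle
outside = [(2.0, 0.0), (1.0, 1.5)]   # outside the unit circle
print([lift(p)[2] < 1 for p in inside])   # [True, True]
print([lift(p)[2] > 1 for p in outside])  # [True, True]
```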
Another trick that the ML community uses for …
Example: There are even papers on how to learn …
    K(x, z) = exp(-||x - z||^2 / 2σ^2)
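A sketch of this Gaussian (RBF) kernel in Python (the default value of sigma is my own choice for illustration):

```python
import math

# Gaussian kernel from the slide: K(x, z) = exp(-||x - z||^2 / (2 sigma^2)).

def gaussian_kernel(x, z, sigma=1.0):
    sq_dist = sum((a - b) ** 2 for a, b in zip(x, z))
    return math.exp(-sq_dist / (2.0 * sigma * sigma))

print(gaussian_kernel([0.0, 0.0], [0.0, 0.0]))  # 1.0 at zero distance
print(gaussian_kernel([0.0, 0.0], [1.0, 0.0]))  # decays with distance
```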
Let L be a linear program and let L’ be the same linear program under a Gaussian perturbation of variance σ^2, where σ^2 <= 1/2d. For any δ, with probability at least 1 - δ, either …
Start with a random vector w, and if a point x satisfies w · x <= 0, update w ← w + x (until done).
One of the most beautiful LP solvers I’ve ever come across…
That’s the entire code! Written in 10 mins.
function w = perceptron(r, b)
r = [r (zeros(length(r),1)+1)];    % Homogenize red points (append 1)
b = -[b (zeros(length(b),1)+1)];   % Homogenize blue points and negate
data = [r; b];                     % Make one point set
s = size(data);                    % Size of data?
w = zeros(1, s(1,2));              % Initialize zero vector
is_error = true;
while is_error
    is_error = false;
    for k = 1:s(1,1)
        if dot(w, data(k,:)) <= 0
            w = w + data(k,:);
            is_error = true;
        end
    end
end
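For readers without MATLAB, here is a Python port of the routine above (my own translation, not the author’s; r holds the red points, b the blue points, and the loop is the same add-a-misclassified-point update):

```python
# Perceptron as in the MATLAB listing: homogenize with an appended 1,
# negate the second class, then bump w by any point with w . x <= 0
# until every point satisfies w . x > 0.

def perceptron(r, b):
    data = [list(p) + [1.0] for p in r] + \
           [[-c for c in list(p) + [1.0]] for p in b]
    w = [0.0] * len(data[0])
    is_error = True
    while is_error:          # terminates if the data is linearly separable
        is_error = False
        for x in data:
            if sum(wi * xi for wi, xi in zip(w, x)) <= 0:
                w = [wi + xi for wi, xi in zip(w, x)]
                is_error = True
    return w

w = perceptron([(2.0, 1.0)], [(-1.0, -2.0)])
print(w)  # a separator: w . (x, 1) > 0 for red points, < 0 for blue
```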
And it can solve any LP!
The math behind…
The Convergence Proof