The Perceptron Algorithm



  1. The Perceptron Algorithm
     Perceptron (Frank Rosenblatt, 1957)
     • First learning algorithm for neural networks;
     • Originally introduced for character classification, where each character is represented as an image.

  2. Perceptron (contd.)
     • Total input to the output node: $\sum_{j=1}^{n} w_j x_j$
     • The output unit applies the activation function $H(x) = \begin{cases} 1 & \text{if } x \ge 0 \\ 0 & \text{if } x < 0 \end{cases}$

     Perceptron: Learning Algorithm
     • Goal: we want to define a learning algorithm for the weights in order to compute a mapping from the inputs to the outputs;
     • Example: two-class character recognition problem.
       – Training set: set of images representing either the character 'a' or the character 'b' (supervised learning);
       – Learning task: learn the weights so that when a new unlabelled image comes in, the network can predict its label;
       – Settings: class 'a' → 1 (class C1), class 'b' → 0 (class C2); n input units (intensity level of a pixel), 1 output unit; the perceptron needs to learn $f : \mathbb{R}^n \to \{0, 1\}$.
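     As a concrete illustration of the forward computation on this slide, here is a minimal sketch in Python (not part of the original slides; the function name `perceptron_output` and the use of NumPy are assumptions made only for illustration):

```python
import numpy as np

def perceptron_output(w, x):
    """Heaviside activation H applied to the total input sum_j w_j * x_j."""
    total_input = np.dot(w, x)
    return 1 if total_input >= 0 else 0

# Toy example: three pixel intensities and arbitrary weights
print(perceptron_output(np.array([0.5, -1.0, 0.3]), np.array([1.0, 0.2, 0.8])))  # -> 1
```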

  3. Perceptron: Learning Algorithm
     The algorithm proceeds as follows:
     • Initial random setting of the weights;
     • The input is a random sequence $\{x_k\}_{k \in \mathbb{N}}$;
     • For each element of class C1, if output = 1 (correct) do nothing, otherwise update the weights;
     • For each element of class C2, if output = 0 (correct) do nothing, otherwise update the weights.

     A bit more formally:
     • $x = (x_1, x_2, \ldots, x_n)$, $w = (w_1, w_2, \ldots, w_n)$, and $\theta$ is the threshold of the output unit;
     • $w^T x = w_1 x_1 + w_2 x_2 + \cdots + w_n x_n$; the output is 1 if $w^T x - \theta \ge 0$;
     • To eliminate the explicit dependence on $\theta$, absorb the threshold into the weights (e.g. $\hat{x} = (x_1, \ldots, x_n, 1)$ and $\hat{w} = (w_1, \ldots, w_n, -\theta)$): the output is 1 if $\hat{w}^T \hat{x} = \sum_{i=1}^{n+1} \hat{w}_i \hat{x}_i \ge 0$.
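     A small sketch (not from the slides) of how the threshold can be folded into an extra weight, assuming the convention of appending a constant 1 to the input; the helper names `augment` and `predict` are illustrative:

```python
import numpy as np

def augment(x):
    """Append a constant 1 so the threshold becomes an ordinary weight."""
    return np.append(x, 1.0)

def predict(w_hat, x):
    """Output 1 iff w_hat^T x_hat >= 0 (threshold already folded into w_hat)."""
    return 1 if np.dot(w_hat, augment(x)) >= 0 else 0

# w = (1, 1), theta = 1.5  ->  w_hat = (1, 1, -1.5)
w_hat = np.array([1.0, 1.0, -1.5])
print(predict(w_hat, np.array([1.0, 1.0])))  # -> 1
print(predict(w_hat, np.array([0.0, 1.0])))  # -> 0
```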

  4. Perceptron: Learning Algorithm
     • We want to learn values of the weights so that the perceptron correctly discriminates elements of C1 from elements of C2.
     • Given x in input, if x is classified correctly the weights are unchanged, otherwise:
       $w' = w + x$ if an element of class C1 (1) was classified as in C2;
       $w' = w - x$ if an element of class C2 (0) was classified as in C1.

     • 1st case: $x \in C_1$ and was classified in $C_2$.
       The correct answer is 1, which corresponds to $\hat{w}^T \hat{x} \ge 0$; we have instead $\hat{w}^T \hat{x} < 0$.
       We want to get closer to the correct answer: $w^T x < w'^T x$.
       This holds iff $w^T x < (w + x)^T x$, and $(w + x)^T x = w^T x + x^T x = w^T x + \|x\|^2$; because $\|x\|^2 \ge 0$, the condition is verified.
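     The error-correction rule on this slide can be written as a one-line update. This is a hedged sketch, assuming the augmented-vector convention from the previous slide; the function name `update` is not from the slides:

```python
import numpy as np

def update(w_hat, x_hat, label):
    """Apply the perceptron correction only when the prediction is wrong.

    label is 1 for class C1 and 0 for class C2.
    """
    prediction = 1 if np.dot(w_hat, x_hat) >= 0 else 0
    if prediction == label:
        return w_hat              # correct: leave the weights unchanged
    if label == 1:
        return w_hat + x_hat      # C1 element misclassified as C2: add x
    return w_hat - x_hat          # C2 element misclassified as C1: subtract x
```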

  5. Perceptron: Learning Algorithm
     $w' = w + x$ if an element of class C1 (1) was classified as in C2; $w' = w - x$ if an element of class C2 (0) was classified as in C1.
     • 2nd case: $x \in C_2$ and was classified in $C_1$.
       The correct answer is 0, which corresponds to $\hat{w}^T \hat{x} < 0$; we have instead $\hat{w}^T \hat{x} \ge 0$.
       We want to get closer to the correct answer: $w^T x > w'^T x$.
       This holds iff $w^T x > (w - x)^T x$, and $(w - x)^T x = w^T x - x^T x = w^T x - \|x\|^2$; because $\|x\|^2 \ge 0$, the condition is verified.
     • The previous rule lets the network move closer to the correct answer whenever it makes an error.

     In summary:
     1. A random sequence $x_1, x_2, \ldots, x_k, \ldots$ is generated such that $x_i \in C_1 \cup C_2$;
     2. If $x_k$ is correctly classified, then $w_{k+1} = w_k$; otherwise
        $w_{k+1} = w_k + x_k$ if $x_k \in C_1$,
        $w_{k+1} = w_k - x_k$ if $x_k \in C_2$.
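     Putting the pieces together, here is a minimal sketch of the full training loop summarized above (Python/NumPy; names such as `train_perceptron` and the specific stopping criterion are illustrative assumptions, not from the slides):

```python
import numpy as np

def train_perceptron(samples, labels, max_epochs=100, rng=None):
    """Cycle through the training set, applying the correction rule on each error.

    samples: array of shape (m, n); labels: 1 for class C1, 0 for class C2.
    Returns the augmented weight vector w_hat of length n + 1.
    """
    rng = rng or np.random.default_rng(0)
    x_hat = np.hstack([samples, np.ones((len(samples), 1))])   # absorb the threshold
    w_hat = rng.normal(size=x_hat.shape[1])                    # random initial weights
    for _ in range(max_epochs):
        errors = 0
        for i in rng.permutation(len(x_hat)):                  # random presentation order
            prediction = 1 if np.dot(w_hat, x_hat[i]) >= 0 else 0
            if prediction != labels[i]:
                w_hat += x_hat[i] if labels[i] == 1 else -x_hat[i]
                errors += 1
        if errors == 0:                                        # no mistakes in a full pass
            break
    return w_hat
```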

  6. Perceptron: Learning Algorithm
     Does the learning algorithm converge?
     Convergence theorem: regardless of the initial choice of weights, if the two classes are linearly separable, i.e. there exists $\hat{w}$ such that
     $\hat{w}^T \hat{x} \ge 0$ if $x \in C_1$ and $\hat{w}^T \hat{x} < 0$ if $x \in C_2$,
     then the learning rule will find such a solution after a finite number of steps.

     Representational Power of Perceptrons
     • Marvin Minsky and Seymour Papert, "Perceptrons", 1969: "The perceptron can solve only problems with linearly separable classes."
     • Examples of linearly separable Boolean functions: AND, OR.
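     For example, using the `train_perceptron` sketch shown after the previous slide (an illustrative helper, not part of the original material), the linearly separable OR function is learned after a few passes, as the convergence theorem guarantees:

```python
import numpy as np

X = np.array([[0, 0], [0, 1], [1, 0], [1, 1]], dtype=float)
y_or = np.array([0, 1, 1, 1])          # OR is linearly separable

w_hat = train_perceptron(X, y_or)
outputs = [1 if np.dot(w_hat, np.append(x, 1.0)) >= 0 else 0 for x in X]
print(outputs)                          # expected: [0, 1, 1, 1]
```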

  7. Representational Power of Perceptrons
     • [Figure] A perceptron with both input weights equal to 1 and bias -1.5 computes the AND function; with the same weights and bias -0.5 it computes the OR function.
     • Example of a non-linearly-separable Boolean function: EX-OR (XOR).
     • The EX-OR function cannot be computed by a perceptron.
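     The figure gives concrete weights, which can be checked directly; the snippet below (illustrative, not from the slides) verifies the AND and OR perceptrons and notes why no single perceptron reproduces XOR:

```python
import numpy as np

def output(w, bias, x):
    """Perceptron output with explicit bias: H(w^T x + bias)."""
    return 1 if np.dot(w, x) + bias >= 0 else 0

inputs = [(0, 0), (0, 1), (1, 0), (1, 1)]
print([output([1, 1], -1.5, x) for x in inputs])   # AND: [0, 0, 0, 1]
print([output([1, 1], -0.5, x) for x in inputs])   # OR:  [0, 1, 1, 1]

# XOR would require outputs [0, 1, 1, 0]: no single line can separate
# {(0,1), (1,0)} from {(0,0), (1,1)}, so no choice of weights and bias
# above can reproduce it with one perceptron.
```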
