  1. PERCEPTRON: How to Train Your Perceptron. 16-385 Computer Vision (Kris Kitani), Carnegie Mellon University

  2. Let’s start easy

  3. world’s smallest perceptron! [diagram: input x → weight w → activation f → output y] y = wx (a.k.a. line equation, linear regression)

  4. Learning a Perceptron. Given a set of samples {x_i, y_i} and a Perceptron y = f_PER(x; w), estimate the parameters w of the Perceptron.

  5. Given training data:

        y      x
        10     10.1
        2      1.9
        3.5    3.4
        1      1.1

     What do you think the weight parameter w is? (y = wx)
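
As a quick sanity check, here is a minimal Python sketch (mine, not from the slides) that fits w to the table above: for the model y = wx, minimizing the squared error has the closed form w = Σ_n x_n y_n / Σ_n x_n².

      # My sketch (not from the slides): closed-form least-squares fit of
      # y = w*x, using the data from the table above.
      xs = [10.1, 1.9, 3.4, 1.1]
      ys = [10.0, 2.0, 3.5, 1.0]

      # Minimizing sum_n (y_n - w*x_n)^2 over w gives w = sum(x*y) / sum(x*x).
      w = sum(x * y for x, y in zip(xs, ys)) / sum(x * x for x in xs)
      print(w)  # ~0.995, i.e. the weight is about 1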

  6. Given training data:

        y      x
        10     10.1
        2      1.9
        3.5    3.4
        1      1.1

     What do you think the weight parameter w is? (y = wx) The answer is not so obvious as the network gets more complicated, so we use …

  7. An Incremental Learning Strategy (gradient descent). Given several examples {(x_1, y_1), (x_2, y_2), …, (x_N, y_N)} and a perceptron ŷ = wx.

  8. An Incremental Learning Strategy (gradient descent). Given several examples {(x_1, y_1), (x_2, y_2), …, (x_N, y_N)} and a perceptron ŷ = wx. Modify the weight w such that ŷ gets ‘closer’ to y.

  9. An Incremental Learning Strategy (gradient descent). Given several examples {(x_1, y_1), (x_2, y_2), …, (x_N, y_N)} and a perceptron ŷ = wx. Modify the weight w such that ŷ gets ‘closer’ to y. (w: perceptron parameter; ŷ: perceptron output; y: true label)

  10. An Incremental Learning Strategy (gradient descent). Given several examples {(x_1, y_1), (x_2, y_2), …, (x_N, y_N)} and a perceptron ŷ = wx. Modify the weight w such that ŷ gets ‘closer’ to y (what does this mean?). (w: perceptron parameter; ŷ: perceptron output; y: true label)
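
(Preview, not on the slides: the standard way to make ‘closer’ precise is to pick a loss function ℓ(ŷ, y) that measures how far the output is from the label, and to repeatedly nudge the weight downhill on that loss, w ← w − η ∂ℓ/∂w, for some step size η. The next slides introduce exactly these pieces.)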

  11. Before diving into gradient descent, we need to understand… Loss Function: defines what it means to be close to the true solution. YOU get to choose the loss function! (some are better than others depending on what you want to do)

  12. Squared Error (L2), a popular loss function: ℓ(ŷ, y) = (ŷ − y)². [plot: ℓ(ŷ, y) versus ŷ − y]

  13. Four common loss functions [plots: each loss versus ŷ − y or y·ŷ]:

        L1 Loss:       ℓ(ŷ, y) = |ŷ − y|
        L2 Loss:       ℓ(ŷ, y) = (ŷ − y)²
        Zero-One Loss: ℓ(ŷ, y) = 1[ŷ ≠ y]
        Hinge Loss:    ℓ(ŷ, y) = max(0, 1 − y·ŷ)
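
For concreteness, a minimal Python sketch (mine, not from the slides) of the four losses above. Here y is the true label and yhat the perceptron output; labels are assumed to be in {−1, +1} for the zero-one and hinge losses, as is standard.

      # My sketch of the four loss functions above (not from the slides).
      # For zero_one_loss and hinge_loss, the true label y is assumed
      # to be in {-1, +1}.
      def l1_loss(yhat, y):
          return abs(yhat - y)

      def l2_loss(yhat, y):
          return (yhat - y) ** 2

      def zero_one_loss(yhat, y):
          return 0.0 if yhat == y else 1.0

      def hinge_loss(yhat, y):
          return max(0.0, 1.0 - y * yhat)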

  14. back to the… World’s Smallest Perceptron! [diagram: input x → weight w → activation f → output y] y = wx (a.k.a. line equation, linear regression). A function of ONE parameter!

  15. Learning a Perceptron. Given a set of samples {x_i, y_i} and a Perceptron y = f_PER(x; w) (what is this activation function?), estimate the parameter w of the Perceptron.

  16. Learning a Perceptron. Given a set of samples {x_i, y_i} and a Perceptron y = f_PER(x; w) (what is this activation function? f(x) = wx, a linear function!), estimate the parameter w of the Perceptron.
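
In code terms, the model of this section is tiny; a minimal Python sketch (mine, not from the slides; the name f_per is my own):

      # My sketch: the world's smallest perceptron, a function of ONE parameter w.
      # Its activation is linear, so the output is simply w * x.
      def f_per(x, w):
          return w * x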

  17. Learning Strategy (gradient descent). Given several examples {(x_1, y_1), (x_2, y_2), …, (x_N, y_N)} and a perceptron ŷ = wx. Modify the weight w such that ŷ gets ‘closer’ to y. (w: perceptron parameter; ŷ: perceptron output; y: true label)

  18. Let’s demystify this process first… Code to train your perceptron:

  19. Let’s demystify this process first… Code to train your perceptron:

         for n = 1 … N
             w = w + (y_n − ŷ_n) x_n      (ŷ_n = w x_n is the current prediction)

     just one line of code!

  20. Let’s demystify this process first… Code to train your perceptron:

         for n = 1 … N
             w = w + (y_n − ŷ_n) x_n      (ŷ_n = w x_n is the current prediction)

     just one line of code! Now where does this come from?
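
A hedged preview of where it comes from (the standard derivation, which the rest of the lecture develops): with the loss ℓ = ½(y_n − ŷ_n)² and ŷ_n = w x_n, the gradient is ∂ℓ/∂w = −(y_n − ŷ_n) x_n, so one gradient descent step w ← w − η ∂ℓ/∂w with η = 1 is exactly the update above. A minimal runnable Python sketch of the loop (my code; the initial weight, step size η, and epoch count are my choices, and η = 1 would diverge on the earlier table's data, so a smaller step is used):

      # My sketch of the training loop above for the world's smallest
      # perceptron y = w*x, reusing the data from the earlier table.
      xs = [10.1, 1.9, 3.4, 1.1]
      ys = [10.0, 2.0, 3.5, 1.0]

      w = 0.0       # initial weight (my choice)
      eta = 0.01    # step size; the slide's update folds in eta = 1
      for epoch in range(100):
          for x_n, y_n in zip(xs, ys):
              yhat_n = w * x_n                      # perceptron output
              w = w + eta * (y_n - yhat_n) * x_n    # the one-line update
      print(w)  # settles near 1.0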
