

SLIDE 1

How to Train Your Perceptron

16-385 Computer Vision (Kris Kitani)

Carnegie Mellon University

PERCEPTRON

SLIDE 2

Let’s start easy

SLIDE 3

world’s smallest perceptron!

[Diagram: input x → weight w → activation f → output y]

    y = wx

(a.k.a. line equation, linear regression)

SLIDE 4

Learning a Perceptron

Given a set of samples {(x_i, y_i)} and a perceptron

    y = f_PER(x; w)

estimate the parameters w of the perceptron.

SLIDE 5

What do you think the weight parameter is?

    y = wx

Given training data:

    x      y
    1      1.1
    2      1.9
    3.5    3.4
    10     10.1

SLIDE 6

What do you think the weight parameter is?

    y = wx

Given training data:

    x      y
    1      1.1
    2      1.9
    3.5    3.4
    10     10.1

Not so obvious as the network gets more complicated, so we use …
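For this tiny example the weight can also be computed in closed form. A minimal Python sketch, assuming a least-squares fit of y = wx (the data come from the table above; the formula is standard, not from the slides):

    # Least-squares fit of y = wx on the training data above.
    # With a single weight, the closed form is w = sum(x*y) / sum(x*x).
    xs = [1.0, 2.0, 3.5, 10.0]
    ys = [1.1, 1.9, 3.4, 10.1]

    w = sum(x * y for x, y in zip(xs, ys)) / sum(x * x for x in xs)
    print(w)  # ~1.005, i.e. the weight is approximately 1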

SLIDE 7

An Incremental Learning Strategy (gradient descent)

Given several examples

    {(x_1, y_1), (x_2, y_2), . . . , (x_N, y_N)}

and a perceptron

    ŷ = wx

SLIDE 8

An Incremental Learning Strategy (gradient descent)

Given several examples

    {(x_1, y_1), (x_2, y_2), . . . , (x_N, y_N)}

and a perceptron

    ŷ = wx

Modify the weight w such that ŷ gets ‘closer’ to y.

SLIDE 9

An Incremental Learning Strategy (gradient descent)

Given several examples

    {(x_1, y_1), (x_2, y_2), . . . , (x_N, y_N)}

and a perceptron

    ŷ = wx

Modify the weight w such that ŷ gets ‘closer’ to y.

(ŷ: perceptron output, y: true label, w: perceptron parameter)

SLIDE 10

An Incremental Learning Strategy (gradient descent)

Given several examples

    {(x_1, y_1), (x_2, y_2), . . . , (x_N, y_N)}

and a perceptron

    ŷ = wx

Modify the weight w such that ŷ gets ‘closer’ to y.

(ŷ: perceptron output, y: true label, w: perceptron parameter)

What does this mean?
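To make ‘closer’ concrete, a small illustration (the sample is from the training table; the candidate weights are illustrative, not from the slides): as w moves toward 1, the prediction error on the sample shrinks.

    # 'Closer' = the gap between the output y_hat and the label y shrinks
    # as the weight improves.
    x, y = 2.0, 1.9                 # one training example from the table
    for w in (0.5, 0.8, 1.0):       # candidate weights (illustrative)
        y_hat = w * x               # perceptron output
        print(w, abs(y_hat - y))    # errors: 0.9, 0.3, then about 0.1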

SLIDE 11

Before diving into gradient descent, we need to understand …

Loss Function: defines what it means to be ‘close’ to the true solution. YOU get to choose the loss function!

(some are better than others depending on what you want to do)

SLIDE 12

Squared Error (L2)

(a popular loss function)

    ℓ(ŷ, y) = (ŷ − y)²

[Plot: the squared-error loss as a function of ŷ − y, a parabola centered at zero]

SLIDE 13
[Four plots, one per loss, each as a function of the prediction]

    L1 Loss:        ℓ(ŷ, y) = |ŷ − y|
    L2 Loss:        ℓ(ŷ, y) = (ŷ − y)²
    Zero-One Loss:  ℓ(ŷ, y) = 1[ŷ ≠ y]
    Hinge Loss:     ℓ(ŷ, y) = max(0, 1 − y · ŷ)
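For reference, the four losses as a minimal Python sketch (the function names are mine; the hinge loss assumes labels y in {−1, +1}, the usual convention, though the slide does not state it):

    def l1_loss(y_hat, y):
        return abs(y_hat - y)              # absolute error

    def l2_loss(y_hat, y):
        return (y_hat - y) ** 2            # squared error

    def zero_one_loss(y_hat, y):
        return 0.0 if y_hat == y else 1.0  # 1 when the prediction is wrong

    def hinge_loss(y_hat, y):
        return max(0.0, 1.0 - y * y_hat)   # assumes y in {-1, +1}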

SLIDE 14

back to the… World’s Smallest Perceptron!

[Diagram: input x → weight w → activation f → output y]

    y = wx

(a.k.a. line equation, linear regression): a function of ONE parameter!

SLIDE 15

Learning a Perceptron

Given a set of samples {(x_i, y_i)} and a perceptron

    y = f_PER(x; w)

estimate the parameter w of the perceptron.

What is this activation function?

SLIDE 16

Learning a Perceptron

Given a set of samples {(x_i, y_i)} and a perceptron

    y = f_PER(x; w)

estimate the parameter w of the perceptron.

What is this activation function?

    f(x) = wx

A linear function!
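In code, the whole model is a single multiplication; a minimal sketch (the name f_per is mine):

    def f_per(x, w):
        # world's smallest perceptron: linear activation, one weight
        return w * x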

SLIDE 17

Learning Strategy (gradient descent)

Given several examples

    {(x_1, y_1), (x_2, y_2), . . . , (x_N, y_N)}

and a perceptron

    ŷ = wx

Modify the weight w such that ŷ gets ‘closer’ to y.

(ŷ: perceptron output, y: true label, w: perceptron parameter)

SLIDE 18

Let’s demystify this process first…

Code to train your perceptron:

SLIDE 19

Let’s demystify this process first…

Code to train your perceptron:

    for n = 1 . . . N
        w = w + (y_n − ŷ) x_n

just one line of code!
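Here is that loop as a runnable Python sketch on the earlier training data. One caveat: the learning rate eta is my addition; with the raw update the weight diverges on the x = 10 sample, so a small step size is used.

    # Train the world's smallest perceptron with the one-line update rule.
    xs = [1.0, 2.0, 3.5, 10.0]
    ys = [1.1, 1.9, 3.4, 10.1]

    w = 0.0       # initial weight (arbitrary)
    eta = 0.01    # learning rate (illustrative; not on the slide)

    for epoch in range(20):                     # several passes over the data
        for x_n, y_n in zip(xs, ys):
            y_hat = w * x_n                     # perceptron output
            w = w + eta * (y_n - y_hat) * x_n   # the one-line update

    print(w)  # settles near 1.0, matching the eyeballed answer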

SLIDE 20

Let’s demystify this process first…

Code to train your perceptron:

    for n = 1 . . . N
        w = w + (y_n − ŷ) x_n

just one line of code! Now where does this come from?
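As a preview of the answer (a sketch; the deck develops this properly via gradient descent): the one-line update is a gradient step on the squared-error loss, with the constant factor folded into the learning rate.

    % For ŷ = wx and ℓ(ŷ, y) = (ŷ − y)²:
    \frac{\partial \ell}{\partial w}
        = \frac{\partial}{\partial w} (wx - y)^2
        = 2(wx - y)\,x
        = -2(y - \hat{y})\,x
    % Gradient descent, with the factor of 2 absorbed into the rate η:
    w \leftarrow w - \eta \frac{\partial \ell}{\partial w}
      = w + \eta\,(y - \hat{y})\,x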