

SLIDE 1

How to Train Your Perceptron

16-385 Computer Vision (Kris Kitani)

Carnegie Mellon University

PERCEPTRON

SLIDE 2

Let’s start easy

SLIDE 3

world’s smallest perceptron!

[Diagram: input x → weight w → activation f → output y]

    y = wx

(a.k.a. line equation, linear regression)

SLIDE 4

Learning a Perceptron

Given a set of samples {(x_i, y_i)} and a perceptron

    y = f_PER(x; w)

estimate the parameters w of the perceptron.

SLIDE 5

What do you think the weight parameter is?

    y = wx

Given training data:

    x      y
    1      1.1
    2      1.9
    3.5    3.4
    10     10.1

SLIDE 6

What do you think the weight parameter is?

    y = wx

Given training data:

    x      y
    1      1.1
    2      1.9
    3.5    3.4
    10     10.1

Not so obvious as the network gets more complicated, so we use …
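For this tiny example the weight can also be computed in closed form. A minimal Python sketch, assuming a least-squares fit of y = wx (the data come from the table above; the formula is standard, not from the slides):

    # Least-squares fit of y = wx on the training data above.
    # With a single weight, the closed form is w = sum(x*y) / sum(x*x).
    xs = [1.0, 2.0, 3.5, 10.0]
    ys = [1.1, 1.9, 3.4, 10.1]

    w = sum(x * y for x, y in zip(xs, ys)) / sum(x * x for x in xs)
    print(w)  # ~1.005, i.e. the weight is approximately 1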

SLIDE 7

An Incremental Learning Strategy (gradient descent)

Given several examples

    {(x_1, y_1), (x_2, y_2), . . . , (x_N, y_N)}

and a perceptron

    ŷ = wx

SLIDE 8

An Incremental Learning Strategy (gradient descent)

Given several examples

    {(x_1, y_1), (x_2, y_2), . . . , (x_N, y_N)}

and a perceptron

    ŷ = wx

Modify the weight w such that ŷ gets ‘closer’ to y.

SLIDE 9

An Incremental Learning Strategy (gradient descent)

Given several examples

    {(x_1, y_1), (x_2, y_2), . . . , (x_N, y_N)}

and a perceptron

    ŷ = wx

Modify the weight w such that ŷ gets ‘closer’ to y.

(ŷ: perceptron output, y: true label, w: perceptron parameter)

SLIDE 10

An Incremental Learning Strategy (gradient descent)

Given several examples

    {(x_1, y_1), (x_2, y_2), . . . , (x_N, y_N)}

and a perceptron

    ŷ = wx

Modify the weight w such that ŷ gets ‘closer’ to y.

(ŷ: perceptron output, y: true label, w: perceptron parameter)

What does this mean?
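To make ‘closer’ concrete, a small illustration (the sample is from the training table; the candidate weights are illustrative, not from the slides): as w moves toward 1, the prediction error on the sample shrinks.

    # 'Closer' = the gap between the output y_hat and the label y shrinks
    # as the weight improves.
    x, y = 2.0, 1.9                 # one training example from the table
    for w in (0.5, 0.8, 1.0):       # candidate weights (illustrative)
        y_hat = w * x               # perceptron output
        print(w, abs(y_hat - y))    # errors: 0.9, 0.3, then about 0.1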

SLIDE 11

Before diving into gradient descent, we need to understand …

Loss Function: defines what it means to be ‘close’ to the true solution. YOU get to choose the loss function!

(some are better than others depending on what you want to do)

SLIDE 12

Squared Error (L2)

(a popular loss function)

    ℓ(ŷ, y) = (ŷ − y)²

[Plot: the squared-error loss as a function of ŷ − y, a parabola centered at zero]

SLIDE 13
[Four plots, one per loss, each as a function of the prediction]

    L1 Loss:        ℓ(ŷ, y) = |ŷ − y|
    L2 Loss:        ℓ(ŷ, y) = (ŷ − y)²
    Zero-One Loss:  ℓ(ŷ, y) = 1[ŷ ≠ y]
    Hinge Loss:     ℓ(ŷ, y) = max(0, 1 − y · ŷ)
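For reference, the four losses as a minimal Python sketch (the function names are mine; the hinge loss assumes labels y in {−1, +1}, the usual convention, though the slide does not state it):

    def l1_loss(y_hat, y):
        return abs(y_hat - y)              # absolute error

    def l2_loss(y_hat, y):
        return (y_hat - y) ** 2            # squared error

    def zero_one_loss(y_hat, y):
        return 0.0 if y_hat == y else 1.0  # 1 when the prediction is wrong

    def hinge_loss(y_hat, y):
        return max(0.0, 1.0 - y * y_hat)   # assumes y in {-1, +1}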

SLIDE 14

back to the… World’s Smallest Perceptron!

[Diagram: input x → weight w → activation f → output y]

    y = wx

(a.k.a. line equation, linear regression): a function of ONE parameter!

SLIDE 15

Learning a Perceptron

Given a set of samples {(x_i, y_i)} and a perceptron

    y = f_PER(x; w)

estimate the parameter w of the perceptron.

What is this activation function?

SLIDE 16

Learning a Perceptron

Given a set of samples {(x_i, y_i)} and a perceptron

    y = f_PER(x; w)

estimate the parameter w of the perceptron.

What is this activation function?

    f(x) = wx

A linear function!
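In code, the whole model is a single multiplication; a minimal sketch (the name f_per is mine):

    def f_per(x, w):
        # world's smallest perceptron: linear activation, one weight
        return w * x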

SLIDE 17

Learning Strategy (gradient descent)

Given several examples

    {(x_1, y_1), (x_2, y_2), . . . , (x_N, y_N)}

and a perceptron

    ŷ = wx

Modify the weight w such that ŷ gets ‘closer’ to y.

(ŷ: perceptron output, y: true label, w: perceptron parameter)

SLIDE 18

Let’s demystify this process first…

Code to train your perceptron:

SLIDE 19

Let’s demystify this process first…

Code to train your perceptron:

    for n = 1 . . . N
        w = w + (y_n − ŷ) x_n

just one line of code!
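Here is that loop as a runnable Python sketch on the earlier training data. One caveat: the learning rate eta is my addition; with the raw update the weight diverges on the x = 10 sample, so a small step size is used.

    # Train the world's smallest perceptron with the one-line update rule.
    xs = [1.0, 2.0, 3.5, 10.0]
    ys = [1.1, 1.9, 3.4, 10.1]

    w = 0.0       # initial weight (arbitrary)
    eta = 0.01    # learning rate (illustrative; not on the slide)

    for epoch in range(20):                     # several passes over the data
        for x_n, y_n in zip(xs, ys):
            y_hat = w * x_n                     # perceptron output
            w = w + eta * (y_n - y_hat) * x_n   # the one-line update

    print(w)  # settles near 1.0, matching the eyeballed answer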

SLIDE 20

Let’s demystify this process first…

Code to train your perceptron:

    for n = 1 . . . N
        w = w + (y_n − ŷ) x_n

just one line of code! Now where does this come from?
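As a preview of the answer (a sketch; the deck develops this properly via gradient descent): the one-line update is a gradient step on the squared-error loss, with the constant factor folded into the learning rate.

    % For ŷ = wx and ℓ(ŷ, y) = (ŷ − y)²:
    \frac{\partial \ell}{\partial w}
        = \frac{\partial}{\partial w} (wx - y)^2
        = 2(wx - y)\,x
        = -2(y - \hat{y})\,x
    % Gradient descent, with the factor of 2 absorbed into the rate η:
    w \leftarrow w - \eta \frac{\partial \ell}{\partial w}
      = w + \eta\,(y - \hat{y})\,x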