Perceptrons
Jonathan Mugan
jonathanwilliammugan@gmail.com
www.jonathanmugan.com
@jmugan
April 10, 2014
(Slides taken from Dan Klein)
Classification: Feature Vectors
Example input (spam filtering): "Hello, Do you want free printr cartriges? Why pay more when you can get them ABSOLUTELY FREE! Just ..."
Features: # free : 2, YOUR_NAME : 0, MISSPELLED : 2, FROM_FRIEND : 0, ...
Label: SPAM (+) vs. HAM (-)

Example input (digit recognition): an image of a handwritten digit
Features: PIXEL-7,12 : 1, PIXEL-7,13 : 0, ..., NUM_LOOPS : 1, ...
Label: “2”
This slide deck courtesy of Dan Klein at UC Berkeley
Some (Simplified) Biology
- Very loose inspiration: human neurons
Linear Classifiers
- Inputs are feature values
- Each feature has a weight
- Sum is the activation
- If the activation is:
- Positive, output +1
- Negative, output -1
[Diagram: inputs f1, f2, f3 multiplied by weights w1, w2, w3, summed (Σ), then thresholded (>0?)]

activation_w(x) = Σ_i w_i · f_i(x) = w · f(x)
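As a concrete reference, here is a minimal Python sketch of this rule. The dict-based sparse representation and the helper names (activation, classify_binary) are our own illustration, not from the slides:

    # Features and weights as dicts mapping feature name -> value.
    def activation(weights, features):
        """Dot product w · f(x) over the features present in the input."""
        return sum(weights.get(name, 0.0) * value
                   for name, value in features.items())

    def classify_binary(weights, features):
        """Output +1 if the activation is positive, else -1."""
        return 1 if activation(weights, features) > 0 else -1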
Example: Spam
- Imagine 3 features (spam is the “positive” class):
- free (number of occurrences of “free”)
- money (occurrences of “money”)
- BIAS (intercept, always has value 1)
Input: "free money"

  Weights w        Features f(x)
  BIAS  : -3       BIAS  : 1
  free  :  4       free  : 1
  money :  2       money : 1
  ...              ...

w · f(x) = (-3)(1) + (4)(1) + (2)(1) = 3 > 0, so predict SPAM
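Running the sketch from the previous slide on this example (same numbers as above):

    weights  = {"BIAS": -3.0, "free": 4.0, "money": 2.0}
    features = {"BIAS": 1.0, "free": 1.0, "money": 1.0}  # f("free money")
    print(activation(weights, features))        # 3.0
    print(classify_binary(weights, features))   # 1, i.e., SPAM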
Classification: Weights
- Binary case: compare features to a weight vector
- Learning: figure out the weight vector from examples
  Weight vector w      Example f(x1)       Example f(x2)
  # free      :  4     # free      : 2     # free      : 0
  YOUR_NAME   : -1     YOUR_NAME   : 0     YOUR_NAME   : 1
  MISSPELLED  :  1     MISSPELLED  : 2     MISSPELLED  : 1
  FROM_FRIEND : -3     FROM_FRIEND : 0     FROM_FRIEND : 1
  ...                  ...                 ...

- A positive dot product means the positive class: w · f(x1) = (4)(2) + (-1)(0) + (1)(2) + (-3)(0) = 10 > 0 (SPAM), while w · f(x2) = 0 + (-1) + 1 + (-3) = -3 < 0 (HAM)
Binary Decision Rule
- In the space of feature vectors
- Examples are points
- Any weight vector defines a hyperplane
- One side corresponds to Y=+1
- Other corresponds to Y=-1
  Weights w
  BIAS  : -3
  free  :  4
  money :  2
  ...

[Plot: feature space with axes "free" and "money"; the line w · f(x) = 0 splits it into a +1 = SPAM side and a -1 = HAM side]
Mistake-Driven Classification
- For Naïve Bayes:
- Parameters come from data statistics
- Parameters have a causal interpretation
- Training is one pass through the data
- For the perceptron:
- Parameters come from reactions to mistakes
- Parameters have a discriminative interpretation
- Training: go through the data until held-out accuracy maxes out

[Diagram: data split into Training Data, Held-Out Data, and Test Data]
Learning: Binary Perceptron
- Start with weights w = 0
- For each training instance (f(x), y*):
- Classify with current weights
- If correct (i.e., y = y*), no change!
- If wrong: adjust the weight vector by adding or subtracting the feature vector, w ← w + y* · f(x); this subtracts f(x) when y* is -1 (see the sketch below)
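A runnable sketch of this loop, reusing the dict-based activation and classify_binary helpers from earlier. The training data format and the fixed number of passes are illustrative assumptions:

    def train_binary_perceptron(data, passes=10):
        """data: list of (features, label) pairs with label in {+1, -1}."""
        weights = {}
        for _ in range(passes):
            for features, label in data:
                if classify_binary(weights, features) != label:
                    # Mistake: add y* · f(x) to the weight vector.
                    for name, value in features.items():
                        weights[name] = weights.get(name, 0.0) + label * value
        return weights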
Multiclass Decision Rule

- If we have more than two classes:
- Keep a weight vector w_y for each class y
- Score (activation) of a class y: score(x, y) = w_y · f(x)
- Prediction: the highest-scoring class wins, y = argmax_y w_y · f(x)
- Binary = multiclass where the negative class has weight zero
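A minimal sketch of this decision rule under the same dict-based setup, with one weight dict per class (classify_multiclass is our name, not from the slides):

    def classify_multiclass(class_weights, features):
        """class_weights: dict mapping class -> weight dict.
        Returns the class whose activation w_y · f(x) is highest."""
        return max(class_weights,
                   key=lambda y: activation(class_weights[y], features))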
Example
Input: "win the vote"

  Weights w_y1     Weights w_y2     Weights w_y3     Features f(x)
  BIAS : -2        BIAS : 1         BIAS : 2         BIAS : 1
  win  :  4        win  : 2         win  : 0         win  : 1
  game :  4        game : 0         game : 2         game : 0
  vote :  0        vote : 4         vote : 0         vote : 1
  the  :  0        the  : 0         the  : 0         the  : 1
  ...              ...              ...              ...

Scores: w_y1 · f(x) = -2 + 4 + 0 + 0 + 0 = 2, w_y2 · f(x) = 1 + 2 + 0 + 4 + 0 = 7, w_y3 · f(x) = 2 + 0 + 0 + 0 + 0 = 2, so class y2 wins
Learning: Multiclass Perceptron
- Start with all weights = 0
- Pick up training examples one by one
- Predict with current weights: y = argmax_y w_y · f(x)
- If correct, no change!
- If wrong: lower the score of the wrong answer and raise the score of the right answer, w_y ← w_y - f(x) and w_y* ← w_y* + f(x) (see the sketch below)
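A runnable sketch of these updates, building on the helpers above; the data format and number of passes are again illustrative:

    def train_multiclass_perceptron(data, classes, passes=10):
        """data: list of (features, true_class) pairs."""
        class_weights = {y: {} for y in classes}
        for _ in range(passes):
            for features, y_true in data:
                y_pred = classify_multiclass(class_weights, features)
                if y_pred != y_true:
                    for name, value in features.items():
                        # Lower the score of the wrong answer...
                        w_wrong = class_weights[y_pred]
                        w_wrong[name] = w_wrong.get(name, 0.0) - value
                        # ...and raise the score of the right answer.
                        w_right = class_weights[y_true]
                        w_right[name] = w_right.get(name, 0.0) + value
        return class_weights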
Example: Multiclass Perceptron
Training sentences: "win the vote", "win the election", "win the game"

Starting weights:

  Weights w_y1     Weights w_y2     Weights w_y3
  BIAS : 1         BIAS : 0         BIAS : 0
  win  : 0         win  : 0         win  : 0
  game : 0         game : 0         game : 0
  vote : 0         vote : 0         vote : 0
  the  : 0         the  : 0         the  : 0
  ...              ...              ...
Examples: Perceptron

- Separable Case

[Plots: a linearly separable data set and the separating boundary the perceptron finds]
Properties of Perceptrons
- Separability: some setting of the parameters gets the training set perfectly correct
- Convergence: if the training data is separable, the perceptron will eventually converge (binary case)
- Mistake Bound: the maximum number of mistakes (binary case) is related to the margin, or degree of separability

[Figures: a separable data set vs. a non-separable data set]
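For reference, one standard form of this bound (the Novikoff result, stated here as an addition to the slide): if every feature vector satisfies ||f(x)|| ≤ R and some weight vector separates the data with margin δ, the perceptron makes at most (R/δ)² mistakes.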
Examples: Perceptron

- Non-Separable Case

[Plots: a data set that is not linearly separable]
Problems with the Perceptron
- Noise: if the data isn’t separable, the weights might thrash
- Averaging weight vectors over time can help (averaged perceptron; see the sketch below)
- Mediocre generalization: the perceptron finds a “barely” separating solution
- Overtraining: test / held-out accuracy usually rises, then falls
- Overtraining is a kind of overfitting
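A minimal sketch of the averaged variant under the same dict-based setup as the earlier sketches. Summing the weights after every example and returning the average is the simple textbook version (practical implementations use a lazier update):

    def train_averaged_perceptron(data, passes=10):
        """Binary averaged perceptron: average the weight vector over
        all steps, which damps thrashing on noisy data."""
        weights, totals, steps = {}, {}, 0
        for _ in range(passes):
            for features, label in data:
                if classify_binary(weights, features) != label:
                    for name, value in features.items():
                        weights[name] = weights.get(name, 0.0) + label * value
                # Accumulate the current weights at every step.
                for name, value in weights.items():
                    totals[name] = totals.get(name, 0.0) + value
                steps += 1
        return {name: total / max(steps, 1) for name, total in totals.items()}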