

SLIDE 1

CS 188: Artificial Intelligence

Perceptrons and Logistic Regression

Pieter Abbeel & Dan Klein University of California, Berkeley

Linear Classifiers

Feature Vectors

Hello,
Do you want free printr cartriges? Why pay more when you can get them ABSOLUTELY FREE! Just ...

# free : 2
YOUR_NAME : 0
MISSPELLED : 2
FROM_FRIEND : 0
...

SPAM

PIXEL-7,12 : 1
PIXEL-7,13 : 0
...
NUM_LOOPS : 1
...

“2”
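To make the feature-vector idea concrete, here is a rough Python sketch of extracting the spam features above from an email. The helper name and misspelling set are hypothetical, not from the slides; note that the misspellings in the email are intentional, which is exactly what the MISSPELLED feature counts.

```python
import re

# Illustrative feature extractor for the spam example (hypothetical helper).
def email_features(text, misspellings=("printr", "cartriges")):
    words = re.findall(r"[a-z]+", text.lower())
    return {
        "# free": words.count("free"),                    # occurrences of "free"
        "YOUR_NAME": 0,       # would check for the recipient's name
        "MISSPELLED": sum(w in misspellings for w in words),
        "FROM_FRIEND": 0,     # would check the sender against an address book
    }

f = email_features("Hello, Do you want free printr cartriges? "
                   "Why pay more when you can get them ABSOLUTELY FREE!")
# f == {"# free": 2, "YOUR_NAME": 0, "MISSPELLED": 2, "FROM_FRIEND": 0}
```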

Some (Simplified) Biology

§ Very loose inspiration: human neurons

SLIDE 2

Linear Classifiers

§ Inputs are feature values
§ Each feature has a weight
§ Sum is the activation: activation_w(x) = Σ_i w_i · f_i(x) = w · f(x)
§ If the activation is:
  § Positive, output +1
  § Negative, output -1

[Diagram: inputs f1, f2, f3 multiplied by weights w1, w2, w3, summed (Σ), and tested against > 0]

Weights

§ Binary case: compare features to a weight vector
§ Learning: figure out the weight vector from examples

f(x1): # free : 2, YOUR_NAME : 0, MISSPELLED : 2, FROM_FRIEND : 0, ...
w: # free : 4, YOUR_NAME : -1, MISSPELLED : 1, FROM_FRIEND : -3, ...
f(x2): # free : 0, YOUR_NAME : 1, MISSPELLED : 1, FROM_FRIEND : 1, ...

If the dot product w · f(x) is positive, predict the positive class (see the sketch below).
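As a quick illustration (my own sketch, not from the slides), the binary decision rule is just a sparse dot product plus a sign test; feature vectors are represented as plain dicts:

```python
# Sketch: binary decision rule as a dot product over feature dicts.
def dot(w, f):
    """Dot product of a weight dict and a feature dict."""
    return sum(w.get(k, 0.0) * v for k, v in f.items())

def classify(w, f):
    """Return +1 if the activation is positive, else -1."""
    return 1 if dot(w, f) > 0 else -1

w = {"# free": 4, "YOUR_NAME": -1, "MISSPELLED": 1, "FROM_FRIEND": -3}
f = {"# free": 2, "YOUR_NAME": 0, "MISSPELLED": 2, "FROM_FRIEND": 0}
print(classify(w, f))  # dot = 8 + 0 + 2 + 0 = 10 > 0, so +1 (SPAM)
```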

Decision Rules

Binary Decision Rule

§ In the space of feature vectors

§ Examples are points
§ Any weight vector is a hyperplane
§ One side corresponds to Y = +1
§ Other corresponds to Y = -1

w: BIAS : -3, free : 4, money : 2, ...

[Plot: feature space with axes “free” and “money”; the line w · f(x) = 0 separates the region +1 = SPAM from the region -1 = HAM]
SLIDE 3

Weight Updates

Learning: Binary Perceptron

§ Start with weights = 0
§ For each training instance:
  § Classify with current weights
  § If correct (i.e., y = y*), no change!
  § If wrong: adjust the weight vector by adding or subtracting the feature vector; subtract if y* is -1, i.e., w ← w + y* · f(x) (see the sketch after this list)
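Here is a minimal Python sketch of this algorithm (my own illustration, not from the slides; the dict-based feature representation and the epochs parameter are assumptions):

```python
# Minimal binary perceptron sketch. Labels are +1 / -1.
def dot(w, f):
    return sum(w.get(k, 0.0) * v for k, v in f.items())

def perceptron_train(data, epochs=10):
    """data: list of (feature_dict, label) pairs with label in {+1, -1}."""
    w = {}  # start with all weights = 0
    for _ in range(epochs):
        for f, y_star in data:
            y = 1 if dot(w, f) > 0 else -1   # classify with current weights
            if y != y_star:                  # if wrong: w <- w + y* . f(x)
                for k, v in f.items():
                    w[k] = w.get(k, 0.0) + y_star * v
    return w

# Example usage with the spam features from the earlier slide:
data = [({"# free": 2, "MISSPELLED": 2}, +1),
        ({"YOUR_NAME": 1, "FROM_FRIEND": 1}, -1)]
w = perceptron_train(data)
```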

Examples: Perceptron

§ Separable Case

SLIDE 4

Multiclass Decision Rule

§ If we have multiple classes:

§ A weight vector for each class: w_y
§ Score (activation) of a class y: w_y · f(x)
§ Prediction: the class with the highest score wins, y = argmax_y w_y · f(x)

Binary = multiclass where the negative class has weight zero

Learning: Multiclass Perceptron

§ Start with all weights = 0
§ Pick up training examples one by one
§ Predict with current weights
§ If correct, no change!
§ If wrong: lower the score of the wrong answer and raise the score of the right answer: w_y ← w_y − f(x), w_{y*} ← w_{y*} + f(x) (see the sketch after this list)
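As with the binary case, a rough Python sketch of this update (my own illustration, not from the slides):

```python
# Minimal multiclass perceptron sketch.
def dot(w, f):
    return sum(w.get(k, 0.0) * v for k, v in f.items())

def multiclass_perceptron_train(data, classes, epochs=10):
    """data: list of (feature_dict, true_class) pairs."""
    weights = {y: {} for y in classes}   # start with all weights = 0
    for _ in range(epochs):
        for f, y_star in data:
            # predict with current weights: highest score wins
            y = max(classes, key=lambda c: dot(weights[c], f))
            if y != y_star:
                for k, v in f.items():
                    weights[y][k] = weights[y].get(k, 0.0) - v            # lower wrong answer
                    weights[y_star][k] = weights[y_star].get(k, 0.0) + v  # raise right answer
    return weights
```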

Example: Multiclass Perceptron

w_SPORTS: BIAS : 1, win : 0, game : 0, vote : 0, the : 0, ...
w_POLITICS: BIAS : 0, win : 0, game : 0, vote : 0, the : 0, ...
w_TECH: BIAS : 0, win : 0, game : 0, vote : 0, the : 0, ...

“win the vote” “win the election” “win the game”

Properties of Perceptrons

§ Separability: true if some parameters get the training set perfectly correct
§ Convergence: if the training data are separable, the perceptron will eventually converge (binary case)
§ Mistake Bound: the maximum number of mistakes (binary case) is related to the margin, or degree of separability (a standard form of the bound follows below)

[Figures: a separable and a non-separable dataset]
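For reference, one standard statement of the binary mistake bound; the slides state a version of this, but the exact constants here follow the usual Novikoff formulation and may differ from the slide:

```latex
% Perceptron mistake bound (Novikoff). Assumptions: every training example
% satisfies \|f(x)\| \le R, and some unit-norm w^* separates the data with
% margin \gamma > 0, i.e. y^* \,(w^* \cdot f(x)) \ge \gamma for all examples.
% Then the perceptron makes at most
\[
  \#\text{mistakes} \;\le\; \frac{R^2}{\gamma^2}
\]
```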

SLIDE 5

Problems with the Perceptron

§ Noise: if the data isn’t separable, weights might thrash

§ Averaging weight vectors over time can help (averaged perceptron)

§ Mediocre generalization: finds a “barely” separating solution
§ Overtraining: test / held-out accuracy usually rises, then falls

§ Overtraining is a kind of overfitting

Improving the Perceptron

Non-Separable Case: Deterministic Decision

Even the best linear boundary makes at least one mistake

Non-Separable Case: Probabilistic Decision

[Plot: near the boundary, examples get soft probabilities such as 0.5 | 0.5, 0.3 | 0.7, 0.1 | 0.9, 0.7 | 0.3, 0.9 | 0.1 instead of hard ±1 labels]

SLIDE 6

How to get probabilistic decisions?

§ Perceptron scoring: z = w · f(x)
§ If z is very positive → want probability going to 1
§ If z is very negative → want probability going to 0
§ Sigmoid function: φ(z) = 1 / (1 + e^(−z)), so P(y = +1 | x; w) = φ(w · f(x))

Best w?

§ Maximum likelihood estimation:

  max_w ll(w) = max_w Σ_i log P(y^(i) | x^(i); w)

  with:

  P(y^(i) = +1 | x^(i); w) = 1 / (1 + e^(−w · f(x^(i))))
  P(y^(i) = −1 | x^(i); w) = 1 − 1 / (1 + e^(−w · f(x^(i))))

  = Logistic Regression (a sketch follows below)
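A minimal numeric sketch of these quantities in Python (my own illustration, not from the slides; the dict-based features match the earlier sketches, and the code favors clarity over numerical stability):

```python
import math

def dot(w, f):
    return sum(w.get(k, 0.0) * v for k, v in f.items())

def sigmoid(z):
    """phi(z) = 1 / (1 + e^(-z))."""
    return 1.0 / (1.0 + math.exp(-z))

def log_likelihood(w, data):
    """ll(w) = sum_i log P(y_i | x_i; w) for labels y in {+1, -1}.
    Uses P(y=+1|x;w) = sigmoid(w . f(x)) and the identity
    P(y=-1|x;w) = 1 - sigmoid(z) = sigmoid(-z)."""
    return sum(math.log(sigmoid(y * dot(w, f))) for f, y in data)
```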

Separable Case: Deterministic Decision – Many Options

Separable Case: Probabilistic Decision – Clear Preference

[Plots: two separating boundaries; under the probabilistic view, points near each boundary get probabilities such as 0.5 | 0.5, 0.3 | 0.7, 0.7 | 0.3, and the larger-margin boundary is clearly preferred]

SLIDE 7

Multiclass Logistic Regression

§ Recall Perceptron:

§ A weight vector for each class: w_y
§ Score (activation) of a class y: w_y · f(x)
§ Prediction: the class with the highest score wins, y = argmax_y w_y · f(x)

§ How to make the scores into probabilities?

original activations: w_y · f(x)

softmax activations: P(y | x; w) = e^(w_y · f(x)) / Σ_{y'} e^(w_{y'} · f(x)) (see the sketch below)
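A small Python sketch of the softmax transformation (illustrative, not from the slides; shifting by the max activation is a standard numerical-stability trick and does not change the result):

```python
import math

def softmax_probs(activations):
    """Turn per-class activations {y: w_y . f(x)} into probabilities."""
    m = max(activations.values())                          # stability shift
    exps = {y: math.exp(a - m) for y, a in activations.items()}
    total = sum(exps.values())
    return {y: e / total for y, e in exps.items()}

# e.g. softmax_probs({"SPORTS": 2.0, "POLITICS": 1.0, "TECH": -1.0})
```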

Best w?

§ Maximum likelihood estimation:

  max_w ll(w) = max_w Σ_i log P(y^(i) | x^(i); w)

  with:

  P(y^(i) | x^(i); w) = e^(w_{y^(i)} · f(x^(i))) / Σ_y e^(w_y · f(x^(i)))

  = Multi-Class Logistic Regression

Next Lecture

§ Optimization

§ i.e., how do we solve: max_w Σ_i log P(y^(i) | x^(i); w)