SLIDE 1

Announcements - Homework

  • Homework 1 is graded, please collect at end of lecture

  • Homework 2 due today
  • Homework 3 out soon (watch email)
  • Question 1 – midterm review
SLIDE 2

HW1 score distribution

[Histogram of HW1 total scores: bins 0~10 through 100~110 on the horizontal axis, student counts from 0 to 40 on the vertical axis]

SLIDE 3

Announcements - Midterm

  • When: Wednesday, 10/20
  • Where: In Class
  • What: You, your pencil, your textbook, your notes, course slides, your calculator, your good mood :)

  • What NOT: No computers, iPhones, or anything else that has an internet connection.

  • Material: Everything from the beginning of the semester, up to and including SVMs and the kernel trick

SLIDE 4

Recitation Tomorrow!

  • Boosting, SVM (convex optimization), midterm review!

  • Strongly recommended!!
  • Place: NSH 3305 (Note: change from last time)
  • Time: 5-6 pm

Rob

SLIDE 5

Support Vector Machines

Aarti Singh

Machine Learning 10-701/15-781, Oct 13, 2010

SLIDE 6

At Pittsburgh G-20 summit …


SLIDE 7

Linear classifiers – which line is better?


SLIDE 8

Pick the one with the largest margin!


SLIDE 9

Parameterizing the decision boundary

[Figure: linear decision boundary w.x + b = 0, with region w.x + b > 0 on one side and w.x + b < 0 on the other]

Data: examples (xi, yi), i = 1, 2, …, n

w.x = Σj w(j) x(j)


SLIDE 11

Maximizing the margin

[Figure: hyperplane w.x + b = 0 with margin γ marked on each side]

margin γ = 2a/‖w‖

Distance of closest examples from the line/hyperplane

SLIDE 12

Maximizing the margin

[Figure: hyperplane w.x + b = 0 with margin γ on each side]

max γ = 2a/‖w‖
w,b
s.t. (w.xj + b) yj ≥ a   ∀j

margin γ = 2a/‖w‖, the distance of the closest examples from the line/hyperplane

Note: ‘a’ is arbitrary (can normalize equations by a); the derivation below makes this precise.
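The jump from this program to the QP on the next slide is exactly that normalization: set a = 1 (equivalently, rescale w and b by 1/a), and note that maximizing 2/‖w‖ is the same as minimizing w.w. In LaTeX:

    \max_{w,b}\ \frac{2a}{\|w\|} \ \ \text{s.t.}\ \ y_j (w \cdot x_j + b) \ge a \ \ \forall j
    \quad\overset{a=1}{\Longrightarrow}\quad
    \min_{w,b}\ w \cdot w \ \ \text{s.t.}\ \ y_j (w \cdot x_j + b) \ge 1 \ \ \forall j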

SLIDE 13

Support Vector Machines

min w.w
w,b
s.t. (w.xj + b) yj ≥ 1   ∀j

Solve efficiently by quadratic programming (QP)

– Well-studied solution algorithms

Linear hyperplane defined by “support vectors”
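For concreteness, this QP can be handed to an off-the-shelf solver. A minimal sketch, assuming the cvxpy package and a made-up linearly separable dataset (neither is from the slides):

    import numpy as np
    import cvxpy as cp

    # Toy separable data: xj in R^2, labels yj in {-1, +1}
    X = np.array([[2.0, 2.0], [3.0, 3.0], [-2.0, -2.0], [-3.0, -1.0]])
    y = np.array([1.0, 1.0, -1.0, -1.0])

    w = cp.Variable(2)
    b = cp.Variable()

    # min w.w  s.t.  (w.xj + b) yj >= 1 for all j
    prob = cp.Problem(cp.Minimize(cp.sum_squares(w)),
                      [cp.multiply(y, X @ w + b) >= 1])
    prob.solve()
    print(w.value, b.value)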

SLIDE 14

Support Vectors

Linear hyperplane defined by “support vectors”

  • Moving other points a little doesn’t affect the decision boundary
  • Only need to store the support vectors to predict labels of new points (illustrated in the sketch below)
  • How many support vectors in the linearly separable case? ≤ m+1 (m = number of features)
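A sketch of this with scikit-learn (my tooling choice, with made-up data): after fitting, the classifier stores exactly the support vectors, and prediction uses only them.

    import numpy as np
    from sklearn.svm import SVC

    X = np.array([[2.0, 2.0], [3.0, 3.0], [-2.0, -2.0], [-3.0, -1.0]])
    y = np.array([1, 1, -1, -1])

    # Very large C approximates the hard-margin SVM on separable data
    clf = SVC(kernel="linear", C=1e6).fit(X, y)

    print(clf.support_vectors_)       # the stored points defining the boundary
    print(clf.predict([[1.0, 1.5]]))  # new points are labeled using them alone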

SLIDE 15

What if data is not linearly separable?

Use features of features, features of features of features, …

x1², x2², x1x2, …, exp(x1)

But we run the risk of overfitting!
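One common way to build such derived features is a polynomial expansion. A sketch with scikit-learn's PolynomialFeatures (my choice of tool; the slide doesn't name one):

    import numpy as np
    from sklearn.preprocessing import PolynomialFeatures

    X = np.array([[1.0, 2.0],
                  [3.0, 4.0]])

    # Degree-2 expansion of (x1, x2): 1, x1, x2, x1^2, x1*x2, x2^2
    poly = PolynomialFeatures(degree=2)
    print(poly.fit_transform(X))
    # Raising the degree adds ever more features (and risk of overfitting)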

SLIDE 16

What if data is still not linearly separable?

min w.w + C #(mistakes)
w,b
s.t. (w.xj + b) yj ≥ 1   ∀j

Allow “error” in classification: maximize the margin and minimize the number of mistakes on training data

  • C - tradeoff parameter
  • Not a QP anymore
  • 0/1 loss doesn’t distinguish between a near miss and a bad mistake

SLIDE 17

What if data is still not linearly separable?

min w.w + C Σj ξj
w,b
s.t. (w.xj + b) yj ≥ 1 - ξj   ∀j
     ξj ≥ 0   ∀j

Allow “error” in classification

  • ξj - “slack” variables ≥ 0 (> 1 if xj misclassified)
  • Pay a linear penalty for each mistake
  • C - tradeoff parameter (chosen by cross-validation; sketched below)
  • Still a QP

Soft margin approach
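Choosing C by cross-validation, as the slide suggests, might look like this in scikit-learn (a sketch; the dataset and the grid of C values are placeholders):

    import numpy as np
    from sklearn.svm import SVC
    from sklearn.model_selection import GridSearchCV

    # Placeholder data; substitute a real dataset
    rng = np.random.default_rng(0)
    X = rng.normal(size=(100, 2))
    y = np.where(X[:, 0] + X[:, 1] > 0, 1, -1)

    # 5-fold cross-validation over a small grid of tradeoff parameters
    search = GridSearchCV(SVC(kernel="linear"),
                          param_grid={"C": [0.01, 0.1, 1, 10, 100]},
                          cv=5)
    search.fit(X, y)
    print(search.best_params_)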

SLIDE 18

Slack variables – Hinge loss

min w.w + C Σj ξj
w,b
s.t. (w.xj + b) yj ≥ 1 - ξj   ∀j
     ξj ≥ 0   ∀j

w.w - complexity penalization
Σj ξj - hinge loss: at the optimum, ξj = max(0, 1 - (w.xj + b) yj)

[Plot: hinge loss vs. 0-1 loss as functions of the margin value (w.x + b) y, with the hinge at 1]

SLIDE 19

SVM vs. Logistic Regression

SVM: hinge loss
Logistic regression: log loss (negative log conditional likelihood)

[Plots: hinge loss vs. 0-1 loss; hinge loss vs. log loss]
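To make the comparison concrete, a small numpy sketch evaluating all three losses at a few margin values m = (w.x + b) y:

    import numpy as np

    m = np.linspace(-2.0, 2.0, 5)  # sample margin values

    zero_one = (m <= 0).astype(float)    # 0/1 loss: 1 iff misclassified
    hinge = np.maximum(0.0, 1.0 - m)     # hinge loss: linear penalty below margin 1
    log_loss = np.log(1.0 + np.exp(-m))  # log loss: -log conditional likelihood

    for row in zip(m, zero_one, hinge, log_loss):
        print(row)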

SLIDE 20

What about multiple classes?


SLIDE 21

One against all

Learn 3 classifiers separately, class k vs. rest: (wk, bk), k = 1, 2, 3

y = arg maxk wk.x + bk

But the wk’s may not be on the same scale. Note: (a wk).x + (a bk) is also a solution for any a > 0
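A sketch of the one-against-all decision rule, assuming the three classifiers have already been trained (the weights below are made up):

    import numpy as np

    # Hypothetical trained classifiers: row k of W is wk, entry k of b is bk
    W = np.array([[1.0, 0.0],
                  [0.0, 1.0],
                  [-1.0, -1.0]])
    b = np.array([0.0, -0.5, 0.2])

    def predict(x):
        # y = arg max over k of wk.x + bk
        return int(np.argmax(W @ x + b))

    print(predict(np.array([2.0, 0.5])))  # scores 2.0, 0.0, -2.3 -> class 0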

SLIDE 22

Learn 1 classifier: Multi-class SVM

Simultaneously learn 3 sets of weights:

y = arg maxk w(k).x + b(k)

Margin - gap between the correct class and the nearest other class (see the formulation sketched below)
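That margin notion can be written out as constraints; this is the standard joint multi-class SVM formulation (my reconstruction, since the slide's equations are images):

    \min_{w^{(1)},\dots,w^{(K)},\,b}\ \sum_k w^{(k)} \cdot w^{(k)}
    \quad \text{s.t.}\quad
    w^{(y_j)} \cdot x_j + b^{(y_j)} \ \ge\ w^{(k)} \cdot x_j + b^{(k)} + 1
    \qquad \forall j,\ \forall k \ne y_j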

SLIDE 23

Learn 1 classifier: Multi-class SVM

Simultaneously learn 3 sets of weights:

y = arg maxk w(k).x + b(k)

Joint optimization: the w(k)’s have the same scale.

SLIDE 24

What you need to know

  • Maximizing margin
  • Derivation of SVM formulation
  • Slack variables and hinge loss
  • Relationship between SVMs and logistic regression

– 0/1 loss
– Hinge loss
– Log loss

  • Tackling multiple classes

– One against All
– Multiclass SVMs


SLIDE 25

SVMs reminder

min w.w + C Σj ξj
w,b
s.t. (w.xj + b) yj ≥ 1 - ξj   ∀j
     ξj ≥ 0   ∀j

w.w - regularization
C Σj ξj - hinge loss penalty

Soft margin approach

SLIDE 26

Today’s Lecture

  • Learn one of the most interesting and exciting recent advancements in machine learning

– The “kernel trick”
– High dimensional feature spaces at no extra cost!

  • But first, a detour

– Constrained optimization!


SLIDE 27

Constrained Optimization


SLIDE 28

Lagrange Multiplier – Dual Variables

Moving the constraint to the objective function via the Lagrangian (a worked example follows below)

Constraint is tight when a > 0
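Since the slide's equations are images, here is a standard one-dimensional worked example, consistent with the two cases on slide 30 (the specific problem is my assumption):

    % Primal: minimize x^2 subject to x >= b
    \min_x\ x^2 \quad \text{s.t.}\ x \ge b

    % Lagrangian: move the constraint into the objective with multiplier a >= 0
    L(x, a) = x^2 - a\,(x - b), \qquad a \ge 0

    % Solve: stationarity in x
    \frac{\partial L}{\partial x} = 2x - a = 0 \;\Rightarrow\; x = \frac{a}{2}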

SLIDE 29

Duality

Primal problem and dual problem (stated in symbols below)

Weak duality - holds for all feasible points
Strong duality - holds under the KKT conditions
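In symbols, using the Lagrangian L(x, a) from before (standard definitions, filled in because the slide's formulas are images):

    p^* = \min_x \max_{a \ge 0} L(x, a) \qquad \text{(primal)}
    d^* = \max_{a \ge 0} \min_x L(x, a) \qquad \text{(dual)}

    \text{Weak duality: } d^* \le p^*
    \qquad
    \text{Strong duality: } d^* = p^*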

SLIDE 30

Lagrange Multiplier – Dual Variables

Solving: two cases, b negative and b positive

When a > 0, constraint is tight