Support Vector Machines COMP 640 Ryan Spring, Sarah Kim - PowerPoint PPT Presentation

Support Vector Machines COMP 640 Ryan Spring, Sarah Kim

Quiz Example Solutions

What is classification? F(x) = -1 Not Spam F(x) = +1 Spam

How should I divide the data?

Linear Classifier $Y = F(w^T x)$, where $w^T x = \sum_i w_i x_i$

Multiple Possible Solutions

Defining Features of SVM

How SVM works [Figure label: w]

Unknown Data (1) [Figure labels: u, w]

Unknown Data (2) Projection of unknown item u onto vector w that is perpendicular to the hyperplane [Figure labels: u, u·w, w]

SVM Decision Rule $w \cdot u + b \ge 0$ then $+$; $w \cdot u + b < 0$ then $-$ [Figure labels: u, u·w, w]
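The decision rule can be sketched in a few lines of Python. This is only an illustration: the values of `w` and `b` below are hypothetical, chosen so the arithmetic is easy to follow, and the helper names `dot` and `classify` are not from the slides.

```python
def dot(a, b):
    """Inner product of two equal-length vectors."""
    return sum(ai * bi for ai, bi in zip(a, b))

def classify(w, b, u):
    """Return +1 if w . u + b >= 0, otherwise -1."""
    return 1 if dot(w, u) + b >= 0 else -1

w = [1.0, 1.0]   # hypothetical vector perpendicular to the hyperplane
b = -3.0         # hypothetical offset

label_a = classify(w, b, [2.0, 2.0])  # w . u + b = 4 - 3 = 1, so +1
label_b = classify(w, b, [1.0, 1.0])  # w . u + b = 2 - 3 = -1, so -1
```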

Learning SVM - Minimizing w $\text{Width} = (x_+ - x_-) \cdot \frac{w}{\|w\|} = \frac{2}{\|w\|}$ Constraints: $y_i (x_i \cdot w + b) - 1 = 0$ for points on the margin, so $x_+ \cdot w = 1 - b$ and $-x_- \cdot w = 1 + b$ [Figure labels: $w/\|w\|$, $x_+ - x_-$]
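As a quick numerical check of the width formula, the sketch below projects the difference of two margin points onto the unit vector w/‖w‖ and compares the result with 2/‖w‖. The values of `w`, `b`, `x_plus`, and `x_minus` are hypothetical, picked so that both points satisfy the margin constraint exactly.

```python
import math

def dot(a, b):
    return sum(ai * bi for ai, bi in zip(a, b))

w = [1.0, 1.0]        # hypothetical weight vector
b = -3.0              # hypothetical offset
x_plus = [2.0, 2.0]   # satisfies +1 * (x . w + b) - 1 = 0
x_minus = [1.0, 1.0]  # satisfies -1 * (x . w + b) - 1 = 0

norm_w = math.sqrt(dot(w, w))
diff = [p - m for p, m in zip(x_plus, x_minus)]

width_projection = dot(diff, w) / norm_w  # (x+ - x-) . w / ||w||
width_closed_form = 2.0 / norm_w          # 2 / ||w||
```

Both expressions evaluate to the square root of 2 here, which is also the distance between the two chosen points, since they lie along the direction of w.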

Learning SVM - Minimizing w Distance between projections of training data: $p(w, b) = \min_{\{x : y = 1\}} \frac{x \cdot w}{|w|} - \max_{\{x : y = -1\}} \frac{x \cdot w}{|w|}$ When maximizing this distance: $p(w_0, b_0) = \frac{2}{|w_0|} = \frac{2}{\sqrt{w_0 \cdot w_0}}$ Minimize this: $w_0 \cdot w_0$

Learning SVM - Penalizing misclassification Hinge Loss Function: $C \sum_{i}^{N} \max(0, 1 - y_i f(x_i))$
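A minimal sketch of the hinge loss term follows. The labels and classifier scores are made-up numbers for illustration, and `hinge_loss` is a hypothetical helper name.

```python
def hinge_loss(ys, scores, C=1.0):
    """C * sum_i max(0, 1 - y_i * f(x_i)), where scores[i] stands for f(x_i)."""
    return C * sum(max(0.0, 1.0 - y * s) for y, s in zip(ys, scores))

ys = [1, -1, 1]             # true labels
scores = [2.0, -0.5, -1.0]  # hypothetical classifier outputs f(x_i)

# Per-point terms: max(0, 1 - 2) = 0, max(0, 1 - 0.5) = 0.5, max(0, 1 + 1) = 2
total = hinge_loss(ys, scores)
```

The first point is correct and beyond the margin, so it costs nothing; the second is correct but inside the margin; the third is misclassified and costs the most.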

Primal Form Classifier: $f(x) = w^T x + b$ For $w \in \mathbb{R}^d$: $\min_{w \in \mathbb{R}^d} \|w\|^2 + C \sum_{i}^{N} \max(0, 1 - y_i f(x_i))$ First term: minimize w (maximize margin). Second term: penalizing misclassification (hinge loss).
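One way to see the primal objective in action is to minimize it directly. The sketch below uses plain batch subgradient descent, which is an assumption made here for illustration and not a method named in the slides; practical SVM solvers usually work differently. The toy data, learning rate, epoch count, and the value of `C` are all hypothetical.

```python
def dot(a, b):
    return sum(ai * bi for ai, bi in zip(a, b))

def objective(w, b, xs, ys, C):
    """||w||^2 + C * sum_i max(0, 1 - y_i * (w . x_i + b))."""
    hinge = sum(max(0.0, 1.0 - y * (dot(w, x) + b)) for x, y in zip(xs, ys))
    return dot(w, w) + C * hinge

def train(xs, ys, C=10.0, lr=0.001, epochs=2000):
    """Batch subgradient descent on the primal objective (illustrative only)."""
    w = [0.0] * len(xs[0])
    b = 0.0
    for _ in range(epochs):
        grad_w = [2.0 * wi for wi in w]  # gradient of ||w||^2
        grad_b = 0.0
        for x, y in zip(xs, ys):
            if y * (dot(w, x) + b) < 1.0:  # hinge term is active for this point
                grad_w = [g - C * y * xi for g, xi in zip(grad_w, x)]
                grad_b -= C * y
        w = [wi - lr * g for wi, g in zip(w, grad_w)]
        b -= lr * grad_b
    return w, b

# Hypothetical, linearly separable toy data
xs = [[2.0, 2.0], [3.0, 3.0], [2.0, 3.0], [0.0, 0.0], [1.0, 0.0], [0.0, 1.0]]
ys = [1, 1, 1, -1, -1, -1]

w, b = train(xs, ys)
predictions = [1 if dot(w, x) + b >= 0 else -1 for x in xs]
```

With a constant step size the iterates only hover near the minimizer, so the learned `w` and `b` should be read as approximate.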

Challenges 1. Handling error (slack vars.) 2. Handling non-linearly separable data (kernels)

1. Handling Error - Slack Variables $\xi_i \ge 0$: all data points. $0 < \xi_i \le 1$: inside the margin. $\xi_i > 1$: misclassified.
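The three slack ranges can be illustrated with a small sketch. It assumes the slack of a point is computed as max(0, 1 - y f(x)), the smallest value that satisfies the margin constraint; the scores are hypothetical and `slack` and `region` are illustrative helper names.

```python
def slack(y, score):
    """Smallest xi >= 0 such that y * score >= 1 - xi."""
    return max(0.0, 1.0 - y * score)

def region(xi):
    """Map a slack value to the ranges listed on the slide."""
    if xi == 0.0:
        return "on or outside the margin"
    if xi <= 1.0:
        return "inside the margin"
    return "misclassified"

# Hypothetical scores f(x) for three points with label y = +1
examples = [(1, 2.0), (1, 0.5), (1, -0.5)]
slacks = [slack(y, s) for y, s in examples]  # [0.0, 0.5, 1.5]
regions = [region(xi) for xi in slacks]
```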

Slack Formulation $\min_{w \in \mathbb{R}^d,\ \xi_i \in \mathbb{R}^+} \|w\|^2 + C \sum_{i}^{N} \xi_i$ Subject to $y_i (w^T x_i + b) \ge 1 - \xi_i$ for $i = 1 \dots N$

2. Non-Linear Separation - Dual Form Solution w can be written as a linear combination of training data: $w = \sum_{j=1}^{N} \alpha_j y_j x_j$ Substitute w in the primal classifier $f(x) = w^T x + b$: $f(x) = \left( \sum_{j=1}^{N} \alpha_j y_j x_j \right)^T x + b = \sum_{i}^{N} \alpha_i y_i (x_i^T x) + b$
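The substitution can be checked numerically: build w from the training points and coefficients, then compare the primal and dual forms of f(x) on a test point. The coefficients `alphas`, the offset `b`, and the data below are hypothetical values chosen for easy arithmetic, not the solution of any optimization.

```python
def dot(a, b):
    return sum(ai * bi for ai, bi in zip(a, b))

xs = [[2.0, 2.0], [1.0, 0.0], [0.0, 1.0]]  # training points
ys = [1, -1, -1]                           # labels
alphas = [0.5, 0.25, 0.25]                 # hypothetical coefficients
b = -1.5                                   # hypothetical offset

# w = sum_j alpha_j * y_j * x_j
w = [sum(a * y * x[d] for a, y, x in zip(alphas, ys, xs)) for d in range(2)]

x_new = [3.0, 1.0]
f_primal = dot(w, x_new) + b
f_dual = sum(a * y * dot(x, x_new) for a, y, x in zip(alphas, ys, xs)) + b
```

Here w works out to [0.75, 0.75], and both forms give 1.5 for the test point. The dual form touches the data only through inner products with the training points.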

Dual Form Problem For $\alpha \in \mathbb{R}^N$: $\max_{\alpha_i \ge 0} \sum_i \alpha_i - \frac{1}{2} \sum_{jk} \alpha_j \alpha_k y_j y_k (x_j^T x_k)$ Subject to $0 \le \alpha_i \le C$ for $\forall i$, and $\sum_i \alpha_i y_i = 0$
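To make the dual objective and its constraints concrete, the sketch below evaluates them on a hypothetical two-point problem. Because the equality constraint forces both coefficients to be equal there, the objective can be scanned by hand. The helper names, the data, and the value of `C` are assumptions for illustration; this is an evaluation of the objective, not a solver.

```python
def dot(a, b):
    return sum(ai * bi for ai, bi in zip(a, b))

def dual_objective(alphas, xs, ys):
    """sum_i alpha_i - 1/2 * sum_jk alpha_j alpha_k y_j y_k (x_j . x_k)."""
    n = len(xs)
    quad = sum(alphas[j] * alphas[k] * ys[j] * ys[k] * dot(xs[j], xs[k])
               for j in range(n) for k in range(n))
    return sum(alphas) - 0.5 * quad

def is_feasible(alphas, ys, C):
    """Check 0 <= alpha_i <= C for all i and sum_i alpha_i y_i = 0."""
    in_box = all(0.0 <= a <= C for a in alphas)
    balanced = abs(sum(a * y for a, y in zip(alphas, ys))) < 1e-12
    return in_box and balanced

xs = [[2.0, 2.0], [1.0, 1.0]]  # hypothetical one-positive, one-negative problem
ys = [1, -1]

# With alpha_1 = alpha_2 = a the objective reduces to 2a - a^2, largest at a = 1
values = [dual_objective([a, a], xs, ys) for a in (0.5, 1.0, 1.5)]
```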

Kernel Trick Dual Form Classifier: $f(x) = \sum_{i}^{N} \alpha_i y_i (x_i^T x) + b$ Kernel: $k(x_i, x) = x_i^T x$ Kernel Classifier: $f(x) = \sum_{i}^{N} \alpha_i y_i k(x_i, x) + b$ Knowledge of inner product is key
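A sketch of the kernel classifier with the kernel passed in as a function, so that swapping the inner product for another kernel changes nothing else. The coefficients, offset, and data are hypothetical, and the function names are illustrative.

```python
def dot(a, b):
    return sum(ai * bi for ai, bi in zip(a, b))

def linear_kernel(xi, x):
    return dot(xi, x)

def poly_kernel(xi, x):
    """Degree-2 polynomial kernel (1 + xi . x)^2."""
    return (1.0 + dot(xi, x)) ** 2

def kernel_classifier(alphas, ys, xs, b, kernel, x):
    """f(x) = sum_i alpha_i * y_i * k(x_i, x) + b."""
    return sum(a * y * kernel(xi, x) for a, y, xi in zip(alphas, ys, xs)) + b

xs = [[1.0, 0.0], [0.0, 1.0]]  # training points
ys = [1, -1]
alphas = [1.0, 1.0]            # hypothetical coefficients
b = 0.0                        # hypothetical offset
x_new = [2.0, 1.0]

f_linear = kernel_classifier(alphas, ys, xs, b, linear_kernel, x_new)  # 2 - 1 = 1
f_poly = kernel_classifier(alphas, ys, xs, b, poly_kernel, x_new)      # 9 - 4 = 5
```

With `linear_kernel` the function reduces to the dual form classifier; only the `kernel` argument differs between the two calls.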

Example: Polynomial Kernel $k(x, x') = (1 + x^T x')^2$ [Figure: scatter of + and - points]
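For two-dimensional inputs, this polynomial kernel equals an ordinary inner product in a six-dimensional feature space. The sketch below checks that numerically; the explicit feature map `phi` is a standard expansion of (1 + x^T x')^2 written out here for illustration, and the test points are arbitrary.

```python
import math

def dot(a, b):
    return sum(ai * bi for ai, bi in zip(a, b))

def poly_kernel(x, z):
    """k(x, z) = (1 + x . z)^2, computed in the original 2-D space."""
    return (1.0 + dot(x, z)) ** 2

def phi(x):
    """Explicit feature map for 2-D input whose inner product gives poly_kernel."""
    r2 = math.sqrt(2.0)
    x1, x2 = x
    return [1.0, r2 * x1, r2 * x2, x1 * x1, x2 * x2, r2 * x1 * x2]

x = [1.0, 2.0]
z = [3.0, -1.0]

k_direct = poly_kernel(x, z)    # (1 + (3 - 2))^2 = 4
k_mapped = dot(phi(x), phi(z))  # same value via the 6-D feature space
```

The kernel never builds the six-dimensional vectors, which is why knowing only the inner product is enough.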

Experiments - Classifying Numbers
• Postal (16x16 pixels): 7,300 training, 2,000 test
• NIST (28x28 pixels): 60,000 training, 10,000 test

Error remains constant with increasing feature space size. Training time?

Comparison with other classifiers

Advantages over Neural Net and kNN
• Neural Net
  – Global optimum not guaranteed
    • Non-convex cost function
  – Several parameters require tuning
• kNN
  – Curse of dimensionality

Conclusions about SVM
• Optimal hyperplane for classification
• Universal learning machine
  – Slack variables (error)
  – Kernels (non-linear separation)
• Knowledge of inner products is key

Other Resources
• Andrew Zisserman's lectures
  – http://www.robots.ox.ac.uk/~az/lectures/ml/lect2.pdf
  – http://www.robots.ox.ac.uk/~az/lectures/ml/lect3.pdf
• MIT AI Course Video
  – https://www.youtube.com/watch?v=_PwhiWxHK8o
