Support Vector Machine Machine Learning 10-601B Seyoung - PowerPoint PPT Presentation

Support ¡Vector ¡Machine ¡ Machine ¡Learning ¡10-‑601B ¡ Seyoung ¡Kim ¡ Many ¡of ¡these ¡slides ¡are ¡derived ¡fromTom ¡ Mitchell, ¡Ziv ¡Bar-‑Joseph. ¡Thanks! ¡

Types ¡of ¡classifiers ¡ • We ¡can ¡divide ¡the ¡large ¡variety ¡of ¡classificaGon ¡approaches ¡into ¡roughly ¡three ¡major ¡types ¡ ¡ ¡ ¡ ¡ ¡ ¡ ¡1. ¡Instance ¡based ¡classifiers ¡ ¡ ¡ ¡ ¡ ¡ ¡ ¡ ¡ ¡ ¡-‑ ¡Use ¡observaGon ¡directly ¡(no ¡models) ¡ ¡ ¡ ¡ ¡ ¡ ¡ ¡ ¡ ¡ ¡-‑ ¡e.g. ¡K ¡nearest ¡neighbors ¡ ¡ ¡ ¡ ¡ ¡ ¡2. ¡Classifiers ¡based ¡on ¡generaGve ¡models: ¡ ¡ ¡ ¡ ¡ ¡ ¡ ¡ ¡ ¡ ¡ ¡ ¡ ¡ ¡-‑ ¡build ¡a ¡generaGve ¡staGsGcal ¡model ¡ ¡ ¡ ¡ ¡ ¡ ¡ ¡ ¡ ¡ ¡ ¡ ¡ ¡ ¡-‑ ¡e.g., ¡Naïve ¡Bayes ¡classifier, ¡classifiers ¡derived ¡from ¡Bayesian ¡networks ¡ ¡ ¡ ¡ ¡ ¡ ¡3. ¡Classifiers ¡based ¡on ¡discriminaGve ¡models: ¡ ¡ ¡ ¡ ¡ ¡ ¡ ¡ ¡ ¡ ¡ ¡ ¡ ¡ ¡-‑ ¡directly ¡esGmate ¡a ¡decision ¡rule/boundary ¡ ¡ ¡ ¡ ¡ ¡ ¡ ¡ ¡ ¡ ¡ ¡ ¡ ¡ ¡-‑ ¡e.g., ¡decision ¡tree, ¡perceptron, ¡logisGc ¡regression ¡

Linear ¡Classifiers ¡ Recall ¡logisGc ¡regression ¡ +1 ¡if ¡sign(w T x+b) ≥ 0 ¡ -‑1 ¡ ¡if ¡sign(w T x+b)<0 ¡

Linear ¡Classifiers ¡ Recall ¡logisGc ¡regression ¡ Line ¡closer ¡to ¡the ¡ blue ¡nodes ¡since ¡ many ¡of ¡them ¡are ¡far ¡ away ¡from ¡the ¡ boundary ¡

Linear ¡Classifiers ¡ Recall ¡logisGc ¡regression ¡ Loss ( y i , w T x i ) ∑ min w i Errors ¡over ¡all ¡samples ¡ Line ¡closer ¡to ¡the ¡ blue ¡nodes ¡since ¡ many ¡of ¡them ¡are ¡far ¡ away ¡from ¡the ¡ boundary ¡

Linear ¡Classifiers ¡ Recall ¡logisGc ¡regression ¡ Loss ( y i , w T x i ) ∑ min w Many ¡more ¡possible ¡ i classifiers ¡ Errors ¡over ¡all ¡samples ¡ Line ¡closer ¡to ¡the ¡ blue ¡nodes ¡since ¡ many ¡of ¡them ¡are ¡far ¡ away ¡from ¡the ¡ boundary ¡

Max ¡margin ¡classifiers ¡ • Instead ¡of ¡fi\ng ¡all ¡points, ¡focus ¡on ¡boundary ¡points ¡ • Learn ¡a ¡boundary ¡that ¡leads ¡to ¡the ¡largest ¡margin ¡from ¡both ¡ sets ¡of ¡points ¡ From ¡all ¡the ¡possible ¡ boundary ¡lines, ¡this ¡ leads ¡to ¡the ¡largest ¡ margin ¡on ¡both ¡sides ¡

Max ¡margin ¡classifiers ¡ • Instead ¡of ¡fi\ng ¡all ¡points, ¡focus ¡on ¡boundary ¡points ¡ • Learn ¡a ¡boundary ¡that ¡leads ¡to ¡the ¡largest ¡margin ¡from ¡both ¡ sets ¡of ¡points ¡ Why? ¡ ¡ D ¡ • ¡IntuiGve, ¡‘makes ¡sense’ ¡ D ¡ • ¡Some ¡theoreGcal ¡ support ¡ • ¡Works ¡well ¡in ¡pracGce ¡

Max ¡margin ¡classifiers ¡ • Instead ¡of ¡fi\ng ¡all ¡points, ¡focus ¡on ¡boundary ¡points ¡ • Learn ¡a ¡boundary ¡that ¡leads ¡to ¡the ¡largest ¡margin ¡from ¡both ¡ sets ¡of ¡points ¡ Also ¡known ¡as ¡linear ¡ D ¡ support ¡vector ¡ machines ¡(SVMs) ¡ D ¡ These ¡are ¡the ¡vectors ¡ supporGng ¡the ¡boundary ¡

Specifying ¡a ¡max ¡margin ¡classifier ¡ Class ¡+1 ¡plane ¡ boundary ¡ Class ¡-‑1 ¡plane ¡ Classify ¡as ¡+1 ¡ ¡ ¡ ¡ ¡ ¡ ¡ ¡ ¡ ¡ ¡ ¡if ¡ ¡ ¡ ¡ ¡ ¡ ¡ ¡ ¡ ¡ ¡ ¡ ¡ ¡ ¡ ¡ ¡ ¡w T x+b ¡ ≥ ¡1 ¡ Classify ¡as ¡-‑1 ¡ ¡ ¡ ¡ ¡ ¡ ¡ ¡ ¡ ¡ ¡ ¡ ¡if ¡ ¡ ¡ ¡ ¡ ¡ ¡ ¡ ¡ ¡ ¡ ¡ ¡ ¡ ¡ ¡ ¡ ¡w T x+b ¡ ≤ ¡-‑ ¡1 ¡ Undefined ¡ ¡ ¡ ¡ ¡ ¡ ¡ ¡ ¡ ¡ ¡ ¡ ¡ ¡ ¡ ¡ ¡ ¡if ¡ ¡ ¡ ¡ ¡ ¡ ¡ ¡ ¡ ¡ ¡ ¡ ¡ ¡ ¡ ¡ ¡-‑1 ¡<w T x+b ¡< ¡1 ¡

Specifying ¡a ¡max ¡margin ¡classifier ¡ Is ¡the ¡linear ¡separaGon ¡ assumpGon ¡realisGc? ¡ ¡ We ¡will ¡deal ¡with ¡this ¡shortly, ¡ but ¡let’s ¡assume ¡for ¡now ¡data ¡ are ¡linearly ¡separable ¡ Classify ¡as ¡+1 ¡ ¡ ¡ ¡ ¡ ¡ ¡ ¡ ¡ ¡ ¡ ¡if ¡ ¡ ¡ ¡ ¡ ¡ ¡ ¡ ¡ ¡ ¡ ¡ ¡ ¡ ¡ ¡ ¡ ¡w T x+b ¡ ≥ ¡1 ¡ Classify ¡as ¡-‑1 ¡ ¡ ¡ ¡ ¡ ¡ ¡ ¡ ¡ ¡ ¡ ¡ ¡if ¡ ¡ ¡ ¡ ¡ ¡ ¡ ¡ ¡ ¡ ¡ ¡ ¡ ¡ ¡ ¡ ¡ ¡w T x+b ¡ ≤ ¡-‑ ¡1 ¡ Undefined ¡ ¡ ¡ ¡ ¡ ¡ ¡ ¡ ¡ ¡ ¡ ¡ ¡ ¡ ¡ ¡ ¡ ¡if ¡ ¡ ¡ ¡ ¡ ¡ ¡ ¡ ¡ ¡ ¡ ¡ ¡ ¡ ¡ ¡ ¡-‑1 ¡<w T x+b ¡< ¡1 ¡

Maximizing ¡the ¡margin ¡ Classify ¡as ¡+1 ¡ ¡ ¡if ¡ ¡ ¡w T x+b ¡ ≥ ¡1 ¡ Classify ¡as ¡-‑1 ¡ ¡ ¡ ¡if ¡ ¡ ¡w T x+b ¡ ≤ ¡-‑ ¡1 ¡ Undefined ¡ ¡ ¡ ¡ ¡ ¡ ¡ ¡ ¡if ¡ ¡ ¡-‑1 ¡<w T x+b ¡< ¡1 ¡ • ¡Let’s ¡define ¡the ¡width ¡of ¡the ¡margin ¡as ¡M ¡ • ¡How ¡can ¡we ¡encode ¡our ¡goal ¡of ¡maximizing ¡M ¡in ¡terms ¡of ¡our ¡ parameters ¡(w ¡and ¡b)? ¡ • ¡Let’s ¡start ¡with ¡a ¡few ¡obsevraGons ¡

Maximizing ¡the ¡margin ¡ Classify ¡as ¡+1 ¡ ¡ ¡if ¡ ¡ ¡w T x+b ¡ ≥ ¡1 ¡ Classify ¡as ¡-‑1 ¡ ¡ ¡ ¡if ¡ ¡ ¡w T x+b ¡ ≤ ¡-‑ ¡1 ¡ Undefined ¡ ¡ ¡ ¡ ¡ ¡ ¡ ¡ ¡if ¡ ¡ ¡-‑1 ¡<w T x+b ¡< ¡1 ¡ • ¡ObservaGon ¡1: ¡the ¡vector ¡w ¡is ¡orthogonal ¡to ¡the ¡+1 ¡plane ¡ • ¡Why? ¡ Let ¡u ¡and ¡v ¡be ¡two ¡points ¡on ¡the ¡+1 ¡plane, ¡then ¡for ¡ the ¡vector ¡defined ¡by ¡u ¡and ¡v ¡we ¡have ¡w T (u-‑v) ¡= ¡0 ¡ ¡ Corollary: ¡the ¡vector ¡w ¡is ¡orthogonal ¡to ¡the ¡-‑1 ¡plane ¡ ¡

Maximizing ¡the ¡margin ¡ Classify ¡as ¡+1 ¡ ¡ ¡if ¡ ¡ ¡w T x+b ¡ ≥ ¡1 ¡ Classify ¡as ¡-‑1 ¡ ¡ ¡ ¡if ¡ ¡ ¡w T x+b ¡ ≤ ¡-‑ ¡1 ¡ Undefined ¡ ¡ ¡ ¡ ¡ ¡ ¡ ¡ ¡if ¡ ¡ ¡-‑1 ¡<w T x+b ¡< ¡1 ¡ • ¡ObservaGon ¡1: ¡the ¡vector ¡w ¡is ¡orthogonal ¡to ¡the ¡+1 ¡and ¡-‑1 ¡planes ¡ • ¡ObservaGon ¡2: ¡if ¡x + ¡is ¡a ¡point ¡on ¡the ¡+1 ¡plane ¡and ¡x -‑ ¡is ¡the ¡closest ¡point ¡to ¡x + ¡ on ¡the ¡-‑1 ¡plane ¡then ¡ ¡ ¡ ¡ ¡ ¡ ¡ ¡ ¡ ¡ ¡ ¡ ¡ ¡ ¡ ¡ ¡ ¡ ¡ ¡ ¡ ¡ ¡ ¡ ¡ ¡ ¡ ¡ ¡ ¡ ¡ ¡ ¡ ¡ ¡ ¡ ¡ ¡ ¡ ¡ ¡ ¡x + ¡= ¡ λ w ¡+ ¡x -‑ ¡ Since ¡w ¡is ¡orthogonal ¡to ¡both ¡planes ¡we ¡ need ¡to ¡‘travel’ ¡some ¡distance ¡along ¡w ¡to ¡ get ¡from ¡x + ¡ ¡to ¡x -‑ ¡ ¡

Pu>ng ¡it ¡together ¡ w T ¡x + ¡+ ¡b ¡= ¡+1 ¡ ⇒ ¡w T ¡( λ w ¡+ ¡x -‑ ) ¡+ ¡b ¡= ¡+1 ¡ • ¡w T ¡x + ¡+ ¡b ¡= ¡+1 ¡ ⇒ ¡w T x -‑ ¡+ ¡b ¡ ¡+ ¡ λ w T w ¡= ¡+1 ¡ • ¡w T ¡x -‑ ¡+ ¡b ¡= ¡-‑1 ¡ • ¡x + ¡= ¡ λ w ¡+ ¡x -‑ ¡ ¡ ¡-‑1 ¡ ¡+ ¡ λ w T w ¡= ¡+1 ¡ ⇒ ¡| ¡x + ¡-‑ ¡x -‑ ¡| ¡= ¡M ¡ • ⇒ ¡ λ ¡= ¡2/w T w ¡ We ¡can ¡now ¡define ¡M ¡in ¡ terms ¡of ¡w ¡and ¡b ¡

Pu>ng ¡it ¡together ¡ M ¡= ¡|x + ¡-‑ ¡x -‑ | ¡ ⇒ ¡ ⇒ ¡ • ¡w T ¡x + ¡+ ¡b ¡= ¡+1 ¡ • ¡w T ¡x -‑ ¡+ ¡b ¡= ¡-‑1 ¡ • ¡x + ¡= ¡ λ w ¡+ ¡x -‑ ¡ ¡ • ¡| ¡x + ¡-‑ ¡x -‑ ¡| ¡= ¡M ¡ ¡ λ ¡= ¡2/w T w ¡ • We ¡can ¡now ¡define ¡M ¡in ¡ terms ¡of ¡w ¡and ¡b ¡

Finding ¡the ¡opAmal ¡parameters ¡ We ¡can ¡now ¡search ¡for ¡the ¡opGmal ¡parameters ¡by ¡finding ¡a ¡soluGon ¡ that: ¡ 1. Correctly ¡classifies ¡all ¡points ¡ 2. Maximizes ¡the ¡margin ¡(or ¡equivalently ¡minimizes ¡w T w) ¡

Support Vector Machine Machine Learning 10-601B Seyoung - PowerPoint PPT Presentation

Support Vector Machine Machine Learning 10-601B Seyoung Kim Many of these slides are derived fromTom Mitchell, Ziv Bar-Joseph. Thanks! Types of

Vector addition: The zero vector The D -vector whose entries are all zero is the zero vector ,

Support Vector Machines Preview What is a support vector machine? The perceptron revisited

Why Deep Learning Is More Natural Questions Efficient than Support Support Vector . . . Support

Multi-class Support Vector Machine Rizal Zaini Ahmad Fathony November 10, 2016 University of

Matrix and Vector Operations Matrix and Vector Operations 1 / 21 Matrix and Vector Operations

Day 3 Advanced Vector Architectures Session A: Vector Instruction Execution Pipelines Break

Support Vector Machines October 16, 2018 Support Vector Machines October 16, 2018 1 / 31

? 17.10.2018 3 17.10.2018 4 Support Vector Machines (SVM): Background Support Vector Machines

Support Vector Machine w T x + b = 0 b || w || Support Vector Support Vector w X i y i ( x

Support Vector Machines This set of notes presents the Support Vector Machine (SVM) learning al-

Machine Learning for NLP Support Vector Machines Aurlie Herbelot 2019 Centre for Mind/Brain

What is a What are Support Vector Machines Support Vector Machine? Used For? An optimally

Lecture 6: Support Vector Machine (Part 1) Feb 10 2020 Lecturer: Steven Wu Scribe: Steven Wu We

Support Vector Machines: Training with Stochastic Gradient Descent Machine Learning 1 Support

Classifiers: Support Vector Machine 1 MACHINE LEARNING What is Classification? Female Adult

Kernel Methods for Regression Support Vector Regression Gaussian Mixture Regression Gaussian

r rt r t r

Local Density of States in 1D Mott Insulators with a Boundary F.H.L. Essler (Oxford) D.

Conformal anomaly, entanglement entropy and boundaries Sergey N. Solodukhin University of Tours

Canadi dian B n Bounda undary Commi missions ns a as a Mode del f for U.S .S. R . Redis

GSP Stakeholder Committee Stakeholder Committee Meeting July 23, 2018 Agenda Welcome,

PR PROACTIVE CTIVE SE SECUR CURITY ITY: : DATA A BREA BREACH CH ASSE ASSESSM SSMENT

1 Data Mining and Privacy The primary task in data mining: Develop models about aggregated

CSC 495.002 Group Projects Dr. Ozg ur Kafal North Carolina State University