Support Vector Machines - CMSC 422 - Marine Carpuat (marine@cs.umd.edu) - PowerPoint PPT Presentation



SLIDE 1

Support Vector Machines

CMSC 422 MARINE CARPUAT

marine@cs.umd.edu

Slides credit: Piyush Rai

SLIDE 2

Back to linear classification

  • Last time: we’ve seen that kernels can help capture non-linear patterns in data while keeping the advantages of a linear classifier

  • Today: Support Vector Machines

– A hyperplane-based classification algorithm
– Highly influential
– Backed by solid theoretical grounding (Cortes & Vapnik, 1995)
– Easy to kernelize

SLIDE 3

The Maximum Margin Principle

  • Find the hyperplane with maximum separation margin on the training data
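As a reference point (standard textbook definitions, not transcribed from the figure on this slide), the quantity being maximized is the geometric margin:

```latex
% Geometric margin of example (x_i, y_i), with y_i in {-1, +1},
% for the hyperplane w^T x + b = 0:
\gamma_i = \frac{y_i \left( \mathbf{w}^\top \mathbf{x}_i + b \right)}{\lVert \mathbf{w} \rVert}
% Margin of the hyperplane on the whole training set:
\gamma = \min_{i = 1, \dots, n} \gamma_i
```

The maximum margin principle picks the (w, b) that maximizes this minimum distance.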

SLIDE 4

SLIDE 5

Support Vector Machine (SVM)

SLIDE 6

Characterizing the margin

Let’s assume the entire training set is correctly classified by the (w, b) that achieves the maximum margin
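A sketch of the standard canonicalization step this characterization relies on (the usual textbook convention, assumed here): rescaling (w, b) does not change the hyperplane, so we may fix the functional margin of the closest points to 1.

```latex
% Canonical hyperplane: rescale (w, b) so that the closest points satisfy
\min_{i} \; y_i \left( \mathbf{w}^\top \mathbf{x}_i + b \right) = 1
% Then every training point obeys
y_i \left( \mathbf{w}^\top \mathbf{x}_i + b \right) \ge 1, \qquad i = 1, \dots, n
% and the distance between the two margin boundaries is 2 / ||w||.
```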

SLIDE 7

The Optimization Problem
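For reference, the hard-margin primal problem this slide sets up (the standard formulation, assuming canonically scaled (w, b)):

```latex
\min_{\mathbf{w}, b} \;\; \frac{1}{2} \lVert \mathbf{w} \rVert^2
\quad \text{s.t.} \quad
y_i \left( \mathbf{w}^\top \mathbf{x}_i + b \right) \ge 1, \qquad i = 1, \dots, n
```

Maximizing the margin 2/||w|| is equivalent to minimizing ½||w||², which gives the quadratic objective.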

SLIDE 8

Large Margin = Good Generalization

  • Intuitively, large margins mean good generalization
    – Large margin => small ||w||
    – Small ||w|| => regularized/simple solutions
  • (Learning theory gives a more formal justification)

SLIDE 9

Solving the SVM Optimization Problem

SLIDE 10

Solving the SVM Optimization Problem

SLIDE 11

Solving the SVM Optimization Problem

SLIDE 12

Solving the SVM Optimization Problem

A Quadratic Program for which many off-the-shelf solvers exist
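The slides hand this QP to an off-the-shelf solver. Purely as an illustrative alternative (my sketch, not part of the deck), the same max-margin objective can also be approached by subgradient descent on the equivalent regularized hinge-loss form, shown here on a hypothetical 2-D toy dataset:

```python
# Illustrative alternative to a QP solver (not the lecture's method):
# subgradient descent on  (lam/2)*||w||^2 + mean_i max(0, 1 - y_i*(w.x_i + b)),
# which encourages a large margin via the small-||w|| regularizer.

def train_svm(points, labels, lam=0.01, lr=0.1, epochs=2000):
    dim = len(points[0])
    w = [0.0] * dim
    b = 0.0
    n = len(points)
    for _ in range(epochs):
        gw = [lam * wj for wj in w]  # gradient of the regularizer
        gb = 0.0
        for x, y in zip(points, labels):
            score = sum(wj * xj for wj, xj in zip(w, x)) + b
            if y * score < 1:  # point inside the margin: hinge term is active
                for j in range(dim):
                    gw[j] -= y * x[j] / n
                gb -= y / n
        w = [wj - lr * gj for wj, gj in zip(w, gw)]
        b -= lr * gb
    return w, b

# Hypothetical linearly separable toy data
X = [(1.0, 1.0), (2.0, 2.5), (0.5, 2.0), (-1.0, -1.0), (-2.0, -1.5), (-0.5, -2.0)]
y = [1, 1, 1, -1, -1, -1]
w, b = train_svm(X, y)
```

With separable data and a small regularizer, the learned (w, b) separates the training set; a QP solver would instead recover the exact max-margin solution.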

SLIDE 13

SVM: the solution!
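For reference, the standard form of the solution (it follows from the KKT conditions of the QP; stated here from the textbook result, not transcribed from the slide):

```latex
\mathbf{w} = \sum_{i=1}^{n} \alpha_i \, y_i \, \mathbf{x}_i, \qquad \alpha_i \ge 0
% Complementary slackness:  \alpha_i \left[ y_i (\mathbf{w}^\top \mathbf{x}_i + b) - 1 \right] = 0,
% so \alpha_i > 0 only for points on the margin boundary -- the support vectors.
```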

SLIDE 14

What if the data is not separable?
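For context, the usual soft-margin extension this question leads to (only previewed here; the deck develops the general case next lecture) introduces slack variables ξ_i that let some points violate the margin at a cost C:

```latex
\min_{\mathbf{w}, b, \boldsymbol{\xi}} \;\; \frac{1}{2} \lVert \mathbf{w} \rVert^2 + C \sum_{i=1}^{n} \xi_i
\quad \text{s.t.} \quad
y_i \left( \mathbf{w}^\top \mathbf{x}_i + b \right) \ge 1 - \xi_i, \qquad \xi_i \ge 0
```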

SLIDE 15

Support Vector Machines

  • Find the max margin linear classifier for a dataset
  • Discovers “support vectors”, the training examples that “support” the margin boundaries
  • Allows misclassified training examples
  • Today: we’ve seen how to learn an SVM if the data is separable
  • Next time: we’ll solve the more general case
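To make “support vectors” concrete, a small sketch with hypothetical data and a hand-picked hyperplane (not the deck’s example): the support vectors are exactly the training points achieving the minimum geometric margin.

```python
# Identify support vectors for a fixed separating hyperplane (w, b):
# they are the training points closest to the hyperplane, i.e. those
# achieving the minimum geometric margin  y*(w.x + b)/||w||.

import math

def geometric_margin(w, b, x, y):
    score = sum(wj * xj for wj, xj in zip(w, x)) + b
    return y * score / math.sqrt(sum(wj * wj for wj in w))

# Hypothetical 2-D data and a separating hyperplane
X = [(2.0, 2.0), (3.0, 3.0), (-2.0, -2.0), (-3.0, -1.0)]
y = [1, 1, -1, -1]
w, b = [1.0, 1.0], 0.0

margins = [geometric_margin(w, b, x, yi) for x, yi in zip(X, y)]
m = min(margins)
support_vectors = [x for x, g in zip(X, margins) if abs(g - m) < 1e-9]
```

Here three of the four points sit at the minimum distance and “support” the margin boundaries; moving the interior point (3, 3) would not change the max-margin hyperplane.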