Training Linear SVMs – Thorsten Joachims, Prasad Seemakurthi – PowerPoint PPT Presentation



SLIDE 1

Training Linear SVMs

By Thorsten Joachims, Prasad Seemakurthi

SLIDE 2

Agenda

  • What is SVM
  • Kernel
  • Hard Margins
  • Soft Margins
  • Linear Algorithm
  • Few Examples
  • Conclusion
SLIDE 3

SVM – Curtain Raiser

  • Linear classification algorithm
  • SVMs have a clever way to prevent over-fitting
  • SVMs have a very clever way to use a huge number of features without requiring nearly as much computation as would seem to be necessary

SLIDE 4

Linear Classifiers (Intuition)

[Figure: linearly separable data with a candidate linear decision boundary; output labeled y (est)]

SLIDE 5

Linear Classifiers

Any of these separators would be fine … but which is best?

[Figure: several candidate linear separators; filled points denote +1, open points denote -1; output labeled y (est)]

SLIDE 6

Linear Classifier
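The body of this slide did not survive extraction. For context, the standard form of the linear classifier it presumably depicted (a reconstruction, not the original slide content):

```latex
f(x; w, b) = \operatorname{sign}(w^{T} x + b) =
\begin{cases}
  +1 & \text{if } w^{T} x + b \ge 0, \\
  -1 & \text{otherwise,}
\end{cases}
```

where $w$ is the weight vector normal to the separating hyperplane and $b$ is the bias.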

SLIDE 7

Maximum Margin

[Figure: the maximum-margin separator; filled points denote +1, open points denote -1; the circled points on the margin are the support vectors]

  • 1. Maximizing the margin is good according to intuition and PAC theory
  • 2. It implies that only the support vectors are important; the other training examples can be ignored
  • 3. It empirically works very well

The classifier with the maximum margin is the simplest kind of SVM, called a linear SVM.

SLIDE 8

Maximizing the Margin

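The slide's figure did not survive extraction. The geometric fact it illustrates: with the canonical scaling $y_i(w^T x_i + b) \ge 1$, the plus- and minus-planes are $w^T x + b = \pm 1$, and the margin width between them is

```latex
\rho = \frac{2}{\lVert w \rVert},
```

so maximizing the margin is equivalent to minimizing $\frac{1}{2}\lVert w \rVert^{2}$.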

SLIDE 9

Why maximize the margin?

  • Points near the decision surface correspond to uncertain classification decisions (close to 50% either way)
  • A classifier with a large margin makes no low-certainty classification decisions
  • This gives a classification safety margin: a slight error in measurement will not cause a misclassification

SLIDE 10

Why maximize the margin?

  • SVM classifier: insists on a large margin around the decision boundary
  • Compared to placing the decision hyperplane anywhere: place a fat separator between the classes
  • Fewer choices of where it can be put
  • Decreased memory capacity
  • Increased ability to generalize correctly to test data

SLIDE 11

Linear SVM Mathematically

SLIDE 12

Linear SVM Mathematically

SLIDE 13

Linear (Hard-Margin) SVM – Formulation

SLIDE 14

Solving the Optimization Problem

  • Find $w$ and $b$ such that
  • $\Phi(w) = \frac{1}{2} w^T w$ is minimized,

and for all training examples $\{(x_i, y_i)\}$: $y_i (w^T x_i + b) \ge 1$.

The solution involves constructing a dual problem in which a Lagrange multiplier $\alpha_i$ is associated with every constraint in the primal problem.
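The dual problem for the hard-margin linear SVM, reconstructed from the standard formulation (the slide's own equations were garbled in extraction):

```latex
\max_{\alpha} \; \sum_{i} \alpha_i - \frac{1}{2} \sum_{i} \sum_{j} \alpha_i \alpha_j \, y_i y_j \, x_i^{T} x_j
\quad \text{s.t.} \quad \alpha_i \ge 0 \;\; \forall i, \qquad \sum_{i} \alpha_i y_i = 0.
```

The solution then gives $w = \sum_i \alpha_i y_i x_i$ and $b = y_k - w^T x_k$ for any support vector $x_k$ (one with $\alpha_k > 0$); only the support vectors have $\alpha_i \neq 0$.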

SLIDE 15

Dataset with Noise – Problem?

SLIDE 16

Soft Margin Classification

  • Slack variables $\xi_k$ can be added to allow misclassification of difficult or noisy examples

What should our quadratic optimization criterion be?

Minimize

$\frac{1}{2} w^T w + C \sum_{k=1}^{R} \xi_k$
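The soft-margin objective can be evaluated directly, since each slack $\xi_k$ equals the hinge loss $\max(0,\, 1 - y_k(w^T x_k + b))$ at the optimum. A minimal sketch in plain Python; the toy dataset, the candidate $(w, b)$, and the value of $C$ are illustrative assumptions, not from the slides:

```python
# Evaluate the soft-margin objective
#   (1/2)||w||^2 + C * sum_k max(0, 1 - y_k (w . x_k + b))
# on an assumed toy 2-D dataset.

def soft_margin_objective(w, b, X, y, C):
    reg = 0.5 * sum(wj * wj for wj in w)          # (1/2)||w||^2
    slack = 0.0
    for xk, yk in zip(X, y):
        score = sum(wj * xj for wj, xj in zip(w, xk)) + b
        slack += max(0.0, 1.0 - yk * score)       # hinge loss = slack xi_k
    return reg + C * slack

X = [(2.0, 2.0), (3.0, 3.0), (-2.0, -2.0), (-3.0, -1.0)]
y = [+1, +1, -1, -1]
w, b = (0.5, 0.5), 0.0   # a candidate separator

print(soft_margin_objective(w, b, X, y, C=1.0))   # → 0.25
```

Here every point clears the margin, so all slacks are zero and only the regularization term $\frac{1}{2}\lVert w \rVert^2$ contributes.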

SLIDE 17

Hard vs. Soft Margin SVM

  • Hard margin does not require guessing the cost parameter (it requires no parameters at all)
  • Soft margin always has a solution, even for non-separable data
  • Soft margin is more robust to outliers
  • Soft margin gives smoother decision surfaces (in the non-linear case)

SLIDE 18

Algorithm
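The slide's algorithm content did not survive extraction. One standard, simple way to train a linear soft-margin SVM is stochastic sub-gradient descent on the primal objective; the Pegasos-style sketch below is an illustrative assumption, not necessarily the algorithm the slide presented:

```python
import random

def train_linear_svm(X, y, lam=0.1, epochs=300, seed=0):
    """Pegasos-style stochastic sub-gradient descent on the
    soft-margin primal objective (illustrative sketch)."""
    rng = random.Random(seed)
    w = [0.0] * len(X[0])
    b = 0.0
    t = 0
    for _ in range(epochs):
        for i in rng.sample(range(len(X)), len(X)):  # shuffled pass
            t += 1
            eta = 1.0 / (lam * t)                    # decaying step size
            margin = y[i] * (sum(wj * xj for wj, xj in zip(w, X[i])) + b)
            # shrink w: sub-gradient of the (lam/2)||w||^2 regularizer
            w = [(1.0 - eta * lam) * wj for wj in w]
            if margin < 1:                           # margin violated: hinge sub-gradient
                w = [wj + eta * y[i] * xj for wj, xj in zip(w, X[i])]
                b += eta * y[i]
    return w, b

# Toy linearly separable data (illustrative, not from the slides)
X = [(2.0, 2.0), (3.0, 1.0), (-2.0, -2.0), (-1.0, -3.0)]
y = [+1, +1, -1, -1]
w, b = train_linear_svm(X, y)
preds = [+1 if sum(wj * xj for wj, xj in zip(w, x)) + b > 0 else -1 for x in X]
print(preds)
```

The step size $\eta_t = 1/(\lambda t)$ decays over time; each update shrinks $w$ toward zero (the regularizer's pull) and, when a point violates the margin, moves $w$ and $b$ toward classifying that point correctly.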

SLIDE 19

SVM Applications

  • SVM has been used successfully in many real-world applications
  • Text (and hypertext) categorization
  • Image classification
  • Bioinformatics (protein classification, cancer classification)
  • Hand-written character classification