Training Linear SVMs by Thorsten Joachims, Prasad Seemakurthi - PowerPoint PPT Presentation

  1. Training Linear SVMs By Thorsten Joachims, Prasad Seemakurthi

  2. Agenda • What is an SVM • Kernels • Hard Margins • Soft Margins • Linear Algorithm • A Few Examples • Conclusion

  3. SVM – Curtain Raiser • A linear classification algorithm • SVMs have a clever way to prevent over-fitting • SVMs can use a huge number of features without requiring nearly as much computation as would seem to be necessary

  4. Linear Classifiers (Intuition) [Figure: labeled data points and a candidate linear separator; the classifier output is denoted y(est)]

  5. Linear Classifiers [Figure: points marked "denotes +1" and "denotes -1" with several candidate separating lines] Any of these would be fine ... but which is best?

  6. Linear Classifier

  7. Maximum Margin 1. Maximizing the margin is good according to intuition and PAC theory 2. Implies that only the support vectors are important 3. Empirically it works well [Figure: the classifier with the maximum margin; the support vectors are the points lying on the margin] This simplest kind of SVM is called a Linear SVM

  8. Maximizing the margin
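
The slide's figure is lost in the transcript. What it illustrates is standard: under the canonical scaling $y_i\,(w^\top x_i + b) \ge 1$, the width of the margin between the two classes is

$$\text{margin} \;=\; \frac{2}{\lVert w \rVert},$$

so maximizing the margin is equivalent to minimizing $\tfrac{1}{2}\,\lVert w \rVert^2$.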

  9. Why maximize the margin? • Points near the decision surface represent uncertain classification decisions (50% either way) • A classifier with a large margin makes no low-certainty classification decisions • This gives a classification safety margin with respect to slight errors in measurement

  10. Why maximize the margin? • An SVM classifier places a large margin around the decision boundary • Compared to choosing just any decision hyperplane: placing a fat separator between the classes leaves fewer choices of where it can be put • Decreased memory capacity • Increased ability to generalize correctly to test data

  11. Linear SVM Mathematically

  12. Linear SVM Mathematically
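
The equations on slides 11 and 12 do not survive in the transcript; the standard setup they introduce is the linear decision function

$$f(x) \;=\; \operatorname{sign}\bigl(w^\top x + b\bigr),$$

where $w$ is the weight vector normal to the separating hyperplane and $b$ is the bias.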

  13. Linear (Hard-Margin) SVM – Formulation
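
The formulation itself is likewise lost; the standard hard-margin quadratic program the title refers to is

$$\min_{w,\,b}\ \tfrac{1}{2}\,\lVert w \rVert^2 \quad\text{subject to}\quad y_i\,(w^\top x_i + b) \ge 1,\qquad i = 1,\dots,n.$$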

  14. Solving the Optimization Problem • Find $w$ and $b$ such that $\Phi(w) = \tfrac{1}{2}\, w^\top w$ is minimized and, for all $\{(x_i, y_i)\}$: $y_i\,(w^\top x_i + b) \ge 1$. The solution involves constructing a dual problem in which a Lagrange multiplier $\alpha_i$ is associated with every constraint in the primal problem:
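
The dual is cut off at this point in the transcript; the standard dual of the hard-margin problem is

$$\max_{\alpha}\ \sum_{i=1}^{n} \alpha_i \;-\; \frac{1}{2}\sum_{i=1}^{n}\sum_{j=1}^{n} \alpha_i \alpha_j\, y_i y_j\, x_i^\top x_j \quad\text{subject to}\quad \alpha_i \ge 0,\quad \sum_{i=1}^{n} \alpha_i y_i = 0.$$

The optimal weight vector is recovered as $w = \sum_i \alpha_i y_i x_i$, and the training points with $\alpha_i > 0$ are the support vectors.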

  15. Dataset with noise [Figure: two classes with noisy, overlapping points] Problem?

  16. Soft Margin Classification • Slack variables $\xi_i$ can be added to allow misclassification of difficult or noisy examples. What should our quadratic optimization criterion be? Minimize $\tfrac{1}{2}\, w^\top w + C \sum_{i=1}^{N} \xi_i$
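
As a minimal sketch of how the slack penalty $C$ behaves in practice, assuming scikit-learn is available (the toy dataset and parameter values below are ours, not the presentation's), LinearSVC with loss="hinge" optimizes essentially this soft-margin criterion:

```python
import numpy as np
from sklearn.svm import LinearSVC

rng = np.random.RandomState(0)

# Two overlapping Gaussian blobs, so some slack is unavoidable.
X = np.vstack([rng.randn(50, 2) + [2, 2], rng.randn(50, 2) - [2, 2]])
y = np.array([1] * 50 + [-1] * 50)

for C in (0.01, 1.0, 100.0):
    # Small C tolerates more margin violations (wider margin);
    # large C approaches hard-margin behaviour.
    clf = LinearSVC(C=C, loss="hinge", max_iter=100_000).fit(X, y)
    margin_width = 2.0 / np.linalg.norm(clf.coef_)
    print(f"C={C:>6}: margin width = {margin_width:.3f}, "
          f"train accuracy = {clf.score(X, y):.2f}")
```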

  17. Hard vs. Soft Margin SVM • Hard-margin does not require guessing the cost parameter (it requires no parameters at all) • Soft-margin always has a solution • Soft-margin is more robust to outliers and gives smoother decision surfaces (in the non-linear case)

  18. Algorithm
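
The algorithm itself does not survive in the transcript; given the talk's source, it is most likely Joachims' cutting-plane method from "Training Linear SVMs in Linear Time". As an illustrative stand-in only, here is a Pegasos-style stochastic subgradient trainer for the soft-margin objective above; every name and parameter in it is ours, not the slide's:

```python
import numpy as np

def pegasos_train(X, y, lam=0.01, n_iters=10_000, seed=0):
    """Pegasos sketch: minimize (lam/2)*||w||^2 + mean hinge loss (no bias)."""
    rng = np.random.RandomState(seed)
    n, d = X.shape
    w = np.zeros(d)
    for t in range(1, n_iters + 1):
        i = rng.randint(n)                # sample one training example
        eta = 1.0 / (lam * t)             # decaying step size
        if y[i] * X[i].dot(w) < 1.0:      # margin violated: add hinge subgradient
            w = (1.0 - eta * lam) * w + eta * y[i] * X[i]
        else:                             # margin satisfied: only shrink w
            w = (1.0 - eta * lam) * w
    return w  # predict with sign(X @ w)
```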

  19. SVM Applications • SVMs have been used successfully in many real-world applications • Text (and hypertext) categorization • Image classification • Bioinformatics (protein classification, cancer classification) • Hand-written character classification
