
CS473: Text Categorization (II), Luo Si



  1. CS473: Text Categorization (II). Luo Si, Department of Computer Science, Purdue University

  2. Text Categorization (II) Outline: Support Vector Machine (SVM), a large-margin classifier
     - Introduction to SVM
     - Linear, hard margin
     - Linear, soft margin
     - Non-linear SVM
     - Discussion

  3. History of SVM: a brief history
     - SVM was inspired by the statistical learning theory of Vapnik (1979) [3]
     - It was put into practical application as “Large Margin Classifiers” in 1992 [1]
     - SVM became famous for its success in handwritten digit recognition [2]
     - SVM has been successfully applied to image detection, speaker identification, text categorization, and many other problems
     [1] B.E. Boser et al. A Training Algorithm for Optimal Margin Classifiers. Proceedings of the Fifth Annual Workshop on Computational Learning Theory, pp. 144-152, Pittsburgh, 1992.
     [2] L. Bottou et al. Comparison of classifier methods: a case study in handwritten digit recognition. Proceedings of the 12th IAPR International Conference on Pattern Recognition, vol. 2, pp. 77-82, 1994.
     [3] V. Vapnik. The Nature of Statistical Learning Theory. 2nd edition, Springer, 1999.

  4. Support Vector Machine: Consider a two-class (binary) classification problem such as text categorization, and find a line that separates the data points of the two classes. There are many possible solutions! Are those decision boundaries equally good?

  5. Support Vector Machine: A slight variation of the data makes some decision boundaries incorrect.

  6. Large-Margin Decision Criterion: The decision boundary should be as far away as possible from the data points of both classes. This means the margin between the data points and the decision boundary should be large. The closest positive and negative data points have equal margin.

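  7. Large-Margin Decision Criterion: The closest positive data point to the boundary satisfies $W^T X_i + b = 1$, and the closest negative data point to the boundary satisfies $W^T X_j + b = -1$. The margin is: each of these closest points lies at distance $1 / \|W\|$ from the boundary, so the two classes are separated by a total width of $2 / \|W\|$.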
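
The margin geometry on this slide can be sketched in a few lines of Python. This is only an illustration: the weight vector, bias, and the two points are made-up values chosen so that the closest points score exactly +1 and -1, and the helper names are hypothetical.

```python
import math

def decision_value(w, b, x):
    """Return W^T x + b for weight vector w, bias b, and point x."""
    return sum(wi * xi for wi, xi in zip(w, x)) + b

def distance_to_boundary(w, b, x):
    """Signed Euclidean distance from x to the hyperplane W^T x + b = 0."""
    norm = math.sqrt(sum(wi * wi for wi in w))
    return decision_value(w, b, x) / norm

# Hypothetical 2-D example (illustrative values, not from the slides).
w, b = [3.0, 4.0], -1.0      # ||W|| = 5
x_pos = [0.0, 0.5]           # closest positive point: W^T x + b = +1
x_neg = [0.0, 0.0]           # closest negative point: W^T x + b = -1

margin_per_side = 1.0 / math.sqrt(sum(wi * wi for wi in w))
total_width = 2.0 * margin_per_side

print(distance_to_boundary(w, b, x_pos), distance_to_boundary(w, b, x_neg))
print(margin_per_side, total_width)
```

Both closest points sit at the same distance from the boundary, on opposite sides, which is why shrinking $\|W\|$ widens the margin.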

  8. Linear SVM: Let $\{x_1, ..., x_n\}$ denote the input data, for example the vector representations of all documents. Let $y_i \in \{1, -1\}$ be the binary indicator of whether $x_i$ belongs to a particular category $c$ or not. The decision boundary should classify all points correctly: $y_i (W^T x_i + b) \ge 1$ for all $i$. The decision boundary can be found by solving the following constrained optimization problem: minimize $\frac{1}{2} \|W\|^2$ subject to $y_i (W^T x_i + b) \ge 1$ for all $i$.
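
To make the constrained problem concrete, here is a minimal sketch that evaluates the standard hard-margin objective $\frac{1}{2}\|W\|^2$ and checks the constraints $y_i (W^T x_i + b) \ge 1$ for a few candidate boundaries. It does not solve the optimization; the toy data set and the candidate $(W, b)$ pairs are assumptions made up for illustration.

```python
def dot(u, v):
    """Plain dot product of two equal-length vectors."""
    return sum(ui * vi for ui, vi in zip(u, v))

def objective(w):
    """Hard-margin objective: (1/2) * ||W||^2."""
    return 0.5 * dot(w, w)

def is_feasible(w, b, xs, ys, tol=1e-9):
    """True if every point satisfies y_i * (W^T x_i + b) >= 1."""
    return all(y * (dot(w, x) + b) >= 1.0 - tol for x, y in zip(xs, ys))

# Hypothetical toy data: two positive and two negative points.
xs = [[2.0, 2.0], [3.0, 3.0], [0.0, 0.0], [-1.0, 0.0]]
ys = [1, 1, -1, -1]

# Candidate boundaries (illustrative values only).
candidates = {
    "wide margin": ([0.5, 0.5], -1.0),
    "narrow margin": ([1.0, 1.0], -2.0),
    "infeasible": ([0.1, 0.1], 0.0),
}
for name, (w, b) in candidates.items():
    print(name, is_feasible(w, b, xs, ys), objective(w))
```

Among the feasible candidates, the one with the smaller objective has the smaller $\|W\|$ and therefore the larger margin.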

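  9. Hard Margin Linear SVM Solution: The optimal parameters are $W^* = \sum_{i \in SV} \alpha_i y_i X_i$, with $b^*$ obtained from $y_i (W^{*T} X_i + b^*) = 1$ for $i \in SV$, where $SV$ is the set of support vectors and $\alpha_i$ are the Lagrange multipliers. Prediction is made by: $\mathrm{sign}(W^{*T} X + b^*) = \mathrm{sign}\left(\sum_{i \in SV} \alpha_i y_i (X_i^T X) + b^*\right)$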
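
The prediction rule above uses only the support vectors. The following sketch assumes a tiny hand-worked example: the two support vectors and their multipliers $\alpha_i = 0.25$ are illustrative values worked out by hand for a toy data set, not output from a real solver.

```python
def dot(u, v):
    """Plain dot product of two equal-length vectors."""
    return sum(ui * vi for ui, vi in zip(u, v))

def weight_vector(alphas, ys, svs):
    """W* = sum over support vectors of alpha_i * y_i * X_i."""
    dim = len(svs[0])
    w = [0.0] * dim
    for a, y, x in zip(alphas, ys, svs):
        for d in range(dim):
            w[d] += a * y * x[d]
    return w

def predict(alphas, ys, svs, b, x):
    """sign(sum_i alpha_i * y_i * (X_i . x) + b), using support vectors only."""
    score = sum(a * y * dot(sv, x) for a, y, sv in zip(alphas, ys, svs)) + b
    return 1 if score >= 0 else -1

# Hypothetical support vectors and multipliers (assumed, worked by hand).
svs = [[2.0, 2.0], [0.0, 0.0]]
sv_labels = [1, -1]
alphas = [0.25, 0.25]

w_star = weight_vector(alphas, sv_labels, svs)
# b* from y_i * (W*^T X_i + b*) = 1 at the first support vector (y_i = +1).
b_star = sv_labels[0] - dot(w_star, svs[0])

print(w_star, b_star)
print(predict(alphas, sv_labels, svs, b_star, [3.0, 1.0]))
print(predict(alphas, sv_labels, svs, b_star, [0.0, 1.0]))
```

Points that are not support vectors never appear in the sum, so prediction cost depends on the number of support vectors rather than on the full training set.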

  10. Soft Margin Linear SVM Solution: What about linearly non-separable data?

  11. Soft Margin Linear SVM Solution: We tolerate some error for specific data points, such as $\xi_1$ and $\xi_2$.

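  12. Soft Margin Linear SVM: Introduce “slack variables” $\xi_i$; slack variables are always non-negative. Introduce a constant $C$ to balance the error of the linear boundary against the margin. The optimization problem becomes: minimize $\frac{1}{2} \|W\|^2 + C \sum_i \xi_i$ subject to $y_i (W^T x_i + b) \ge 1 - \xi_i$ and $\xi_i \ge 0$ for all $i$.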
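
As a rough illustration of how the slack variables and the constant $C$ interact, the sketch below computes, for a fixed boundary, the smallest slack that satisfies each constraint, $\xi_i = \max(0, 1 - y_i (W^T x_i + b))$, and the resulting soft-margin objective. The data, the boundary, and the values of `C` are assumptions chosen for illustration; no optimization is performed.

```python
def dot(u, v):
    """Plain dot product of two equal-length vectors."""
    return sum(ui * vi for ui, vi in zip(u, v))

def slacks(w, b, xs, ys):
    """Smallest xi_i >= 0 with y_i * (W^T x_i + b) >= 1 - xi_i, for fixed (W, b)."""
    return [max(0.0, 1.0 - y * (dot(w, x) + b)) for x, y in zip(xs, ys)]

def soft_margin_objective(w, b, xs, ys, C):
    """(1/2) * ||W||^2 + C * sum_i xi_i."""
    return 0.5 * dot(w, w) + C * sum(slacks(w, b, xs, ys))

# Hypothetical toy data; the last point is a negative example lying among
# the positives, so the data are not linearly separable.
xs = [[2.0, 2.0], [3.0, 3.0], [0.0, 0.0], [-1.0, 0.0], [1.5, 1.5]]
ys = [1, 1, -1, -1, -1]

w, b = [0.5, 0.5], -1.0   # illustrative boundary, not an optimized one

print(slacks(w, b, xs, ys))
print(soft_margin_objective(w, b, xs, ys, C=1.0))
print(soft_margin_objective(w, b, xs, ys, C=10.0))
```

Only the misplaced point gets non-zero slack, and a larger `C` makes that violation more expensive relative to the margin term.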

  13. Non-linear SVM: A linear SVM only uses a line to separate data points. How can it be generalized to the non-linear case? Key idea: transform $X_i$ to a higher-dimensional space.
     - Input space: the space where the points $x_i$ are located
     - Feature space: the space of $\Phi(x_i)$ after the transformation

  14. Non-linear SVM: Key idea: transform $X_i$ to a higher-dimensional space. [Figure: data points plotted on axes $x_1$ and $x_2$]

  15. Non-linear SVM: Key idea: transform $X_i$ to a higher-dimensional space.
     - Input space: the space where the points $x_i$ are located
     - Feature space: the space after the transformation
     Use $\Phi(x_i)$ to transform low-level features into high-level features. Sometimes the transformation $\Phi(x_i)$ maps to a very high-dimensional or even infinite-dimensional space.
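
A small sketch of the "transform to a higher-dimensional space" idea: 1-D points that no single threshold can separate become linearly separable after mapping each $x$ to $\Phi(x) = (x, x^2)$. The data, the particular map, and the separating boundary in feature space are all illustrative assumptions, not taken from the slides.

```python
def phi(x):
    """Hypothetical feature map from 1-D input space to 2-D feature space."""
    return (x, x * x)

def linear_classify(w, b, z):
    """Linear decision rule sign(W^T z + b) applied in feature space."""
    score = sum(wi * zi for wi, zi in zip(w, z)) + b
    return 1 if score >= 0 else -1

def separable_by_threshold(xs, ys):
    """Check whether any single threshold on the 1-D input separates the classes."""
    pts = sorted(xs)
    thresholds = ([pts[0] - 1.0]
                  + [(lo + hi) / 2.0 for lo, hi in zip(pts, pts[1:])]
                  + [pts[-1] + 1.0])
    for t in thresholds:
        for s in (1, -1):  # try both orientations of the boundary
            if all((1 if s * (x - t) > 0 else -1) == y for x, y in zip(xs, ys)):
                return True
    return False

# Toy data: positives on the outside, negatives in the middle.
xs = [-3.0, -2.0, -1.0, 0.0, 1.0, 2.0, 3.0]
ys = [1, 1, -1, -1, -1, 1, 1]

# In feature space the boundary x^2 = 2.5 is a straight line.
w, b = [0.0, 1.0], -2.5
predictions = [linear_classify(w, b, phi(x)) for x in xs]

print(separable_by_threshold(xs, ys))
print(predictions == ys)
```

The classifier is still linear; only the space it operates in has changed.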

  16. Text Categorization: Evaluation. Performance of different algorithms on the Reuters-21578 corpus: 90 categories, 7769 training docs, 3019 test docs (Yang, JIR 1999).

  17. SVM Toolkit:
     - SMO: Sequential Minimal Optimization
     - SVM-Light
     - LibSVM
     - BSVM
     - ……

  18. Text Categorization (II) Outline: Support Vector Machine (SVM), a large-margin classifier
     - Introduction to SVM
     - Linear, hard margin
     - Linear, soft margin
     - Non-linear SVM
     - Discussion
