CS 376: Computer Vision, Lecture 24 (4/19/2018)
Support Vector Machines and Kernels
Thurs April 19, Kristen Grauman, UT Austin


  1. Last time / Today

     Last time
     • Sliding window object detection wrap-up
     • Attentional cascade
     • Applications / examples
     • Pros and cons

     Today
     • Supervised classification, continued
       - Nearest neighbors
       - Support vector machines
         • HoG pedestrians example
       - Kernels
         • Multi-class from binary classifiers
         • Pyramid match kernels
     • Evaluation
       - Scoring an object detector
       - Scoring a multi-class recognition system

  2. Nearest Neighbor classification
     • Assign the label of the nearest training data point to each test data point.
     • Figure (from Duda et al.): Voronoi partitioning of feature space for 2-category 2D data. Black = negative, red = positive. A novel test example that is closest to a positive training example is classified as positive.

     K-Nearest Neighbors classification
     • For a new point, find the k closest points from the training data.
     • The labels of those k points "vote" to classify the new point.
     • Example (k = 5): if the query lands where its 5 nearest neighbors are 3 negatives and 2 positives, it is classified as negative. (Source: D. Lowe)

     Three case studies
     • Boosting + face detection, e.g., Viola & Jones
     • SVM + person detection, e.g., Dalal & Triggs
     • NN + scene Gist classification, e.g., Hays & Efros
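To make the voting rule above concrete, here is a minimal k-NN sketch in Python (NumPy only); the toy points and the Euclidean distance choice are illustrative assumptions, not from the slides:

```python
import numpy as np

def knn_classify(query, X_train, y_train, k=5):
    """Label `query` by majority vote among its k nearest training points."""
    dists = np.linalg.norm(X_train - query, axis=1)  # Euclidean distances
    nearest = np.argsort(dists)[:k]                  # indices of the k closest
    votes = np.bincount(y_train[nearest])            # count labels among them
    return votes.argmax()

# Toy 2D data: label 0 = negative, 1 = positive.
X = np.array([[0., 0.], [1., 0.], [0., 1.], [5., 5.], [6., 5.], [5., 6.]])
y = np.array([0, 0, 0, 1, 1, 1])
print(knn_classify(np.array([5.5, 5.2]), X, y, k=5))  # 3 of 5 neighbors positive -> 1
```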

  3. Where in the World? [Hays and Efros. im2gps: Estimating Geographic Information from a Single Image. CVPR 2008.]
     • 6+ million geotagged photos by 109,788 photographers, annotated by Flickr users.

     Spatial Envelope Theory of Scene Representation, Oliva & Torralba (2001)
     • A scene is a single surface that can be represented by global (statistical) descriptors.
     (Slide credit: Aude Oliva)

  4. Global texture: capturing the "Gist" of the scene
     • Capture global image properties while keeping some spatial information.
     • Gist descriptor: Oliva & Torralba, IJCV 2001; Torralba et al., CVPR 2003.

     Which scene properties are relevant?
     • Gist scene descriptor
     • Color histograms: L*a*b*, 4 x 14 x 14 histograms
     • Texton histograms: 512-entry, filter-bank based
     • Line features: histograms of straight-line statistics

     Im2gps: scene matches [Hays and Efros. im2gps: Estimating Geographic Information from a Single Image. CVPR 2008.]
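A sketch of the color-histogram property listed above, assuming "4 x 14 x 14" means a joint histogram with 4 L* bins and 14 bins each for a* and b*; the channel ranges are also assumptions (the slide gives only the bin counts):

```python
import numpy as np
from skimage import color

def lab_histogram(rgb_image):
    """Joint 4 x 14 x 14 histogram over L*, a*, b*, L1-normalized."""
    lab = color.rgb2lab(rgb_image)      # shape (H, W, 3), float
    pixels = lab.reshape(-1, 3)
    # Assumed channel ranges: L* in [0, 100], a* and b* roughly [-128, 128).
    hist, _ = np.histogramdd(pixels,
                             bins=(4, 14, 14),
                             range=((0, 100), (-128, 128), (-128, 128)))
    return (hist / hist.sum()).ravel()  # 784-dim descriptor

# Usage (hypothetical file): feat = lab_histogram(skimage.io.imread("scene.jpg"))
```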

  5. Im2gps: scene matches [Hays and Efros. im2gps: Estimating Geographic Information from a Single Image. CVPR 2008.]

  6. Scene matches [Hays and Efros. im2gps: Estimating Geographic Information from a Single Image. CVPR 2008.]

     Quantitative evaluation: test set …

  7. The importance of data [Hays and Efros. im2gps: Estimating Geographic Information from a Single Image. CVPR 2008.]

     Nearest neighbors: pros and cons
     • Pros:
       - Simple to implement
       - Flexible to feature / distance choices
       - Naturally handles multi-class cases
       - Can do well in practice with enough representative data
     • Cons:
       - Large search problem to find the nearest neighbors (see the KD-tree sketch below)
       - Storage of all the training data
       - Must know we have a meaningful distance function
     (Kristen Grauman)

     Three case studies
     • Boosting + face detection, e.g., Viola & Jones
     • SVM + person detection, e.g., Dalal & Triggs
     • NN + scene Gist classification, e.g., Hays & Efros
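The "large search problem" con can be softened with an indexing structure; here is a sketch using SciPy's KD-tree on synthetic descriptors. Note that in very high dimensions KD-trees degrade toward brute-force search, which is why approximate methods are common in practice:

```python
import numpy as np
from scipy.spatial import cKDTree

rng = np.random.default_rng(0)
X_train = rng.random((10_000, 32))   # synthetic stand-ins for image descriptors

tree = cKDTree(X_train)              # build the index once, reuse for all queries
query = rng.random(32)
dists, idx = tree.query(query, k=5)  # the 5 nearest neighbors, without a full scan
print(idx, dists)
```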

  8. Linear classifiers
     • Find a linear function that separates the positive and negative examples:

       $x_i$ positive:  $x_i \cdot w + b \ge 0$
       $x_i$ negative:  $x_i \cdot w + b < 0$

     • Which line is best?

     Support Vector Machines (SVMs)
     • Discriminative classifier based on the optimal separating line (for the 2D case).
     • Maximize the margin between the positive and negative training examples.

  9. Support vector machines
     • Want the line that maximizes the margin:

       $x_i$ positive ($y_i = 1$):   $x_i \cdot w + b \ge 1$
       $x_i$ negative ($y_i = -1$):  $x_i \cdot w + b \le -1$

       For support vectors: $x_i \cdot w + b = \pm 1$

     • Distance between point $x_i$ and the line: $\frac{|x_i \cdot w + b|}{\|w\|}$
     • For support vectors this distance is $\frac{\pm 1}{\|w\|}$, so the margin is $M = \frac{2}{\|w\|}$.

     Finding the maximum margin line
     1. Maximize the margin $2 / \|w\|$.
     2. Correctly classify all training data points:
        $x_i$ positive ($y_i = 1$):   $x_i \cdot w + b \ge 1$
        $x_i$ negative ($y_i = -1$):  $x_i \cdot w + b \le -1$

     Quadratic optimization problem: minimize $\frac{1}{2} w^T w$ subject to $y_i (w \cdot x_i + b) \ge 1$.

     [C. Burges, A Tutorial on Support Vector Machines for Pattern Recognition, Data Mining and Knowledge Discovery, 1998]
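A sketch of this quadratic program solved with scikit-learn's SVC, where a large C approximates the hard-margin problem; the toy data is an assumption, and the printed margin is the $2/\|w\|$ derived above:

```python
import numpy as np
from sklearn.svm import SVC

# Toy linearly separable data, labels y in {-1, +1}.
X = np.array([[1., 1.], [2., 2.], [2., 0.],
              [0., 0.], [1., 0.], [0., 1.]])
y = np.array([1, 1, 1, -1, -1, -1])

# Large C approximates the hard-margin problem:
# minimize (1/2) w^T w  subject to  y_i (w . x_i + b) >= 1.
clf = SVC(kernel="linear", C=1e6).fit(X, y)

w, b = clf.coef_[0], clf.intercept_[0]
print("w =", w, ", b =", b)
print("margin 2/||w|| =", 2 / np.linalg.norm(w))
print("support vectors:\n", clf.support_vectors_)
```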

  10. Finding the maximum margin line
      • Solution: $w = \sum_i \alpha_i y_i x_i$, where the $\alpha_i$ are learned weights and the training points $x_i$ with $\alpha_i > 0$ are the support vectors.
      • $b = y_i - w \cdot x_i$ (for any support vector $x_i$).
      • Classification function:
        $f(x) = \operatorname{sign}(w \cdot x + b) = \operatorname{sign}\left(\sum_i \alpha_i y_i \, x_i \cdot x + b\right)$
        If $f(x) < 0$, classify as negative; if $f(x) > 0$, classify as positive.
      [C. Burges, A Tutorial on Support Vector Machines for Pattern Recognition, Data Mining and Knowledge Discovery, 1998]

      Person detection with HoGs & linear SVMs
      • Histograms of oriented gradients (HoG): map each grid cell in the input window to a histogram counting the gradients per orientation.
      • Train a linear SVM using a training set of pedestrian vs. non-pedestrian windows.
      [Dalal & Triggs, CVPR 2005]
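A sketch of the pipeline summarized above, using scikit-image's hog and scikit-learn's LinearSVC. The 128 x 64 window and the HoG parameters follow the standard Dalal & Triggs setup, while loading the pedestrian and background crops is left out as an assumption:

```python
import numpy as np
from skimage.feature import hog
from sklearn.svm import LinearSVC

def window_features(gray_window):
    """HoG descriptor for one 128 x 64 (rows x cols) grayscale window."""
    return hog(gray_window,
               orientations=9,            # gradient orientation bins per cell
               pixels_per_cell=(8, 8),
               cells_per_block=(2, 2),
               block_norm="L2-Hys")

def train_detector(pos_windows, neg_windows):
    """pos_windows / neg_windows: lists of 128 x 64 grayscale crops
    (pedestrians vs. background); loading them is omitted here."""
    X = np.array([window_features(w) for w in pos_windows + neg_windows])
    y = np.array([1] * len(pos_windows) + [0] * len(neg_windows))
    return LinearSVC(C=0.01).fit(X, y)   # soft linear SVM

# At test time, score every sliding window and keep those above a threshold:
# score = detector.decision_function([window_features(window)])
```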

  11. Person detection with HoGs & linear SVMs
      • Histograms of Oriented Gradients for Human Detection. Navneet Dalal and Bill Triggs. International Conference on Computer Vision & Pattern Recognition, June 2005.
      • http://lear.inrialpes.fr/pubs/2005/DT05/

      Understanding classifier mistakes
      • Carl Vondrick, http://web.mit.edu/vondrick/ihog/slides.pdf

  12. HOGgles: Visualizing Object Detection Features
      Carl Vondrick, Aditya Khosla, Tomasz Malisiewicz, Antonio Torralba (MIT)
      http://web.mit.edu/vondrick/ihog/slides.pdf

  13. HOGgles: Visualizing Object Detection Features (ICCV 2013)
      Carl Vondrick, Aditya Khosla, Tomasz Malisiewicz, Antonio Torralba (MIT)
      http://web.mit.edu/vondrick/ihog/slides.pdf
      Demo: http://carlvondrick.com/ihog/

  14. Questions
      • What if the data is not linearly separable?

      Non-linear SVMs
      • Datasets that are linearly separable with some noise work out great (1D example: a single threshold on $x$ separates the classes).
      • But what are we going to do if the dataset is just too hard (no single threshold on $x$ works)?
      • How about mapping the data to a higher-dimensional space, e.g., $x \mapsto (x, x^2)$?

      Non-linear SVMs: feature spaces
      • General idea: map the original input space to some higher-dimensional feature space where the training set is separable: $\Phi: x \to \varphi(x)$
      (Slide from Andrew Moore's tutorial: http://www.autonlab.org/tutorials/svm.html)
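A sketch of the $x \mapsto (x, x^2)$ idea on a 1D toy set (the data is an assumption, not from the slides): no single threshold separates the classes on the line, but after the explicit quadratic map a linear SVM does, and an RBF-kernel SVM achieves the same implicitly:

```python
import numpy as np
from sklearn.svm import SVC

# 1D data no single threshold can separate: positives flank the negatives.
x = np.array([-3., -2., -1., 0., 1., 2., 3.])
y = np.array([1, 1, -1, -1, -1, 1, 1])

# Explicit feature map phi(x) = (x, x^2): a line now separates the classes.
phi = np.column_stack([x, x ** 2])
lin = SVC(kernel="linear").fit(phi, y)
print(lin.predict(np.column_stack([[1.5], [1.5 ** 2]])))  # [-1]: 2.25 lies in the negative band

# Kernel trick: the same effect without ever constructing phi explicitly.
rbf = SVC(kernel="rbf", gamma=1.0).fit(x.reshape(-1, 1), y)
print(rbf.predict([[1.5]]))
```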

