Support vector machines and kernels Thurs Nov 19 Kristen Grauman - PDF document

11/18/2015
Support vector machines and kernels
Thurs Nov 19
Kristen Grauman, UT Austin

Last time
• Sliding window object detection pros and cons
• Attentional cascade
• Object proposals for detection
• Nearest neighbor classification
• Scene recognition example with global descriptors

Today
• HMM examples
• Support vector machines (SVM)
  – Basic algorithm
  – Kernels
    • Structured input spaces: Pyramid match kernels
  – Multi-class
  – HOG + SVM for person detection
    • Visualizing a feature: Hoggles
• Evaluating an object detector

Window-based models: Three case studies
• Boosting + face detection: Viola & Jones
• SVM + person detection: e.g., Dalal & Triggs
• NN + scene Gist classification: e.g., Hays & Efros
Slide credit: Kristen Grauman

Recall: Nearest Neighbor classification
• Assign label of nearest training data point to each test data point
[Figure: Voronoi partitioning of feature space for 2-category 2D data, from Duda et al. Black = negative, red = positive. The novel test example is closest to a positive example from the training set, so classify it as positive.]

6+ million geotagged photos by 109,788 photographers, annotated by Flickr users
Slide credit: James Hays
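The nearest neighbor rule recalled above can be sketched in a few lines. This is a minimal illustration, not code from the lecture: the function name `nearest_neighbor_label`, the Euclidean distance, and the toy 2D points are all assumptions made for the example.

```python
import math

def nearest_neighbor_label(train_points, train_labels, query):
    """Return the label of the training point closest to `query` (Euclidean distance)."""
    best_index = min(
        range(len(train_points)),
        key=lambda i: math.dist(train_points[i], query),
    )
    return train_labels[best_index]

# Made-up 2-category 2D data: +1 = positive, -1 = negative.
train_points = [(0.0, 0.0), (1.0, 0.5), (4.0, 4.0), (5.0, 3.5)]
train_labels = [-1, -1, +1, +1]

# The query lies nearest to a positive training example, so it is labeled positive.
prediction = nearest_neighbor_label(train_points, train_labels, (4.2, 3.0))
```

Every test point requires a scan over the whole training set, which is part of why the amount of training data matters so much for this classifier.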

Im2gps: Scene Matches
[Hays and Efros. im2gps: Estimating Geographic Information from a Single Image. CVPR 2008.]
Slide credit: James Hays

The Importance of Data
[Hays and Efros. im2gps: Estimating Geographic Information from a Single Image. CVPR 2008.]
Slide credit: James Hays

HMM example: Photo Geo-location
Where was this picture taken?
Slide credit: Kristen Grauman

Example: Photo Geo-location
Where was this picture taken?
Slide credit: Kristen Grauman

Example: Photo Geo-location
Where was each picture in this sequence taken?
Slide credit: Kristen Grauman

Idea: Exploit the beaten path
• Learn dynamics model from “training” tourist photos
• Exploit timestamps and sequences for novel “test” photos
[Chen & Grauman CVPR 2011]
Slide credit: Kristen Grauman

Discovering a city’s locations
Define states with data-driven approach: mean shift clustering on the GPS coordinates of the training images.
[Figure: discovered locations for New York]

Observation model
P(Observation | State), e.g. P(image | Liberty Island)

[Figure: state transition diagram over Location 1, Location 2, and Location 3. Every ordered pair of locations has a transition probability $P(L_j \mid L_i)$, including the self-transitions $P(L_1 \mid L_1)$, $P(L_2 \mid L_2)$, and $P(L_3 \mid L_3)$.]
Slide credit: Kristen Grauman
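Given location states, transition probabilities, and an observation model, one standard way to label a photo sequence is Viterbi decoding. The slides do not say which inference procedure was used, so the sketch below is only an illustration of HMM decoding; the function `viterbi` and all probabilities are made-up assumptions.

```python
def viterbi(prior, transition, likelihoods):
    """Most likely state sequence for an HMM.

    prior[s]          : P(first state = s)
    transition[r][s]  : P(next state = s | current state = r)
    likelihoods[t][s] : P(observation at time t | state = s)
    """
    n_states = len(prior)
    score = [prior[s] * likelihoods[0][s] for s in range(n_states)]
    backpointers = []
    for obs in likelihoods[1:]:
        step, new_score = [], []
        for s in range(n_states):
            # Best predecessor for state s at this time step.
            best_prev = max(range(n_states), key=lambda r: score[r] * transition[r][s])
            step.append(best_prev)
            new_score.append(score[best_prev] * transition[best_prev][s] * obs[s])
        backpointers.append(step)
        score = new_score
    # Trace back from the best final state.
    state = max(range(n_states), key=lambda s: score[s])
    path = [state]
    for step in reversed(backpointers):
        state = step[state]
        path.append(state)
    return path[::-1]

# Hypothetical 3-location model (indices 0, 1, 2 stand for Locations 1, 2, 3).
prior = [1 / 3, 1 / 3, 1 / 3]
transition = [
    [0.7, 0.2, 0.1],
    [0.1, 0.7, 0.2],
    [0.1, 0.2, 0.7],
]
# One row per photo in the sequence: P(photo | location), invented values.
likelihoods = [
    [0.80, 0.10, 0.10],
    [0.50, 0.30, 0.20],
    [0.05, 0.15, 0.80],
]
path = viterbi(prior, transition, likelihoods)
```

The second photo is ambiguous on its own, but the transition model pulls it toward the location of the first photo, which is the "beaten path" intuition.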

Observation model
[Figure]
Slide credit: Kristen Grauman

Location estimation accuracy
[Figure]
Slide credit: Kristen Grauman

Qualitative Result – New York
Slide credit: Kristen Grauman

Discovering travel guides’ beaten paths
Routes from travel guide book for New York vs. random walks in learned HMM
Slide credit: Kristen Grauman

Video textures
• Schodl, Szeliski, Salesin, Essa; Siggraph 2000.
• http://www.cc.gatech.edu/cpl/projects/videotexture/

Today
• HMM examples
• Support vector machines (SVM)
  – Basic algorithm
  – Kernels
    • Structured input spaces: Pyramid match kernels
  – Multi-class
  – HOG + SVM for person detection
    • Visualizing a feature: Hoggles
• Evaluating an object detector

Window-based models: Three case studies
• Boosting + face detection: Viola & Jones
• SVM + person detection: e.g., Dalal & Triggs
• NN + scene Gist classification: e.g., Hays & Efros
Slide credit: Kristen Grauman

Linear classifiers

Linear classifiers
• Find linear function to separate positive and negative examples
  $x_i$ positive: $x_i \cdot w + b \ge 0$
  $x_i$ negative: $x_i \cdot w + b < 0$
Which line is best?

Support Vector Machines (SVMs)
• Discriminative classifier based on optimal separating line (for 2d case)
• Maximize the margin between the positive and negative training examples
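As a small illustration of the linear decision rule above, the sketch below evaluates the sign of $x \cdot w + b$. The function name `linear_classify` and the particular `w`, `b`, and test points are hypothetical values chosen for the example.

```python
def linear_classify(w, b, x):
    """Return +1 if x . w + b >= 0, otherwise -1."""
    activation = sum(wi * xi for wi, xi in zip(w, x)) + b
    return 1 if activation >= 0 else -1

# Hypothetical line in 2D.
w, b = (1.0, -1.0), 0.5

labels = [linear_classify(w, b, x) for x in [(2.0, 1.0), (0.0, 3.0)]]
```

Many different `(w, b)` pairs separate the same training set, which is what motivates asking which line is best.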

Support vector machines
• Want line that maximizes the margin.
  $x_i$ positive ($y_i = 1$): $x_i \cdot w + b \ge 1$
  $x_i$ negative ($y_i = -1$): $x_i \cdot w + b \le -1$
  For support vectors, $x_i \cdot w + b = \pm 1$
• Distance between point $x_i$ and line: $\frac{|x_i \cdot w + b|}{\|w\|}$
• For support vectors: $\frac{w^T x + b}{\|w\|} = \frac{\pm 1}{\|w\|}$, so $M = \left| \frac{1}{\|w\|} - \frac{-1}{\|w\|} \right| = \frac{2}{\|w\|}$
[Figure: separating line with support vectors and margin $M$]
C. Burges, A Tutorial on Support Vector Machines for Pattern Recognition, Data Mining and Knowledge Discovery, 1998

Support vector machines
• Want line that maximizes the margin.
  $x_i$ positive ($y_i = 1$): $x_i \cdot w + b \ge 1$
  $x_i$ negative ($y_i = -1$): $x_i \cdot w + b \le -1$
  For support vectors, $x_i \cdot w + b = \pm 1$
• Distance between point $x_i$ and line: $\frac{|x_i \cdot w + b|}{\|w\|}$
• Therefore, the margin is $2 / \|w\|$
[Figure: separating line with support vectors and margin $M$]
C. Burges, A Tutorial on Support Vector Machines for Pattern Recognition, Data Mining and Knowledge Discovery, 1998

Finding the maximum margin line
1. Maximize margin $2 / \|w\|$
2. Correctly classify all training data points:
   $x_i$ positive ($y_i = 1$): $x_i \cdot w + b \ge 1$
   $x_i$ negative ($y_i = -1$): $x_i \cdot w + b \le -1$
Quadratic optimization problem:
  Minimize $\frac{1}{2} w^T w$
  Subject to $y_i (w \cdot x_i + b) \ge 1$
C. Burges, A Tutorial on Support Vector Machines for Pattern Recognition, Data Mining and Knowledge Discovery, 1998
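To make the link between the margin $2 / \|w\|$ and the constraints $y_i (w \cdot x_i + b) \ge 1$ concrete, the sketch below compares two feasible lines on a toy data set. It does not solve the quadratic program; the helper names and all numbers are assumptions for illustration.

```python
import math

def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def margin_width(w):
    """Margin 2 / ||w|| of the line defined by w (with constraints scaled to >= 1)."""
    return 2.0 / math.sqrt(dot(w, w))

def satisfies_constraints(w, b, points, labels):
    """Check y_i (w . x_i + b) >= 1 for every training point."""
    return all(y * (dot(w, x) + b) >= 1 for x, y in zip(points, labels))

# Made-up training set: one positive and one negative example.
points = [(3.0, 3.0), (1.0, 1.0)]
labels = [1, -1]

# Two candidate solutions with the same direction but different scale.
w_wide, b_wide = (0.5, 0.5), -2.0      # constraints hold with equality
w_narrow, b_narrow = (1.0, 1.0), -4.0  # constraints hold with slack to spare

both_feasible = (
    satisfies_constraints(w_wide, b_wide, points, labels)
    and satisfies_constraints(w_narrow, b_narrow, points, labels)
)
margin_wide = margin_width(w_wide)
margin_narrow = margin_width(w_narrow)
```

Both candidates classify the data correctly, but the one with the smaller $\|w\|$ has the larger margin, which is why the problem is posed as minimizing $\frac{1}{2} w^T w$.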

Finding the maximum margin line
• Solution: $w = \sum_i \alpha_i y_i x_i$
  ($\alpha_i$: learned weight, $x_i$: support vector)
• $b = y_i - w \cdot x_i$ (for any support vector)
• $w \cdot x + b = \sum_i \alpha_i y_i \, x_i \cdot x + b$
• Classification function:
  $f(x) = \text{sign}(w \cdot x + b) = \text{sign}\left( \sum_i \alpha_i y_i \, x_i \cdot x + b \right)$
  If $f(x) < 0$, classify as negative; if $f(x) > 0$, classify as positive.
C. Burges, A Tutorial on Support Vector Machines for Pattern Recognition, Data Mining and Knowledge Discovery, 1998
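The solution form above can be exercised on a tiny example. In this sketch the support vectors and the weights `alphas` are hand-picked hypothetical values (in practice they come from a quadratic programming solver); the code only shows how $w$, $b$, and the classification function follow from them.

```python
def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

# Hypothetical support vectors, labels, and learned weights alpha_i.
support_vectors = [(3.0, 3.0), (1.0, 1.0)]
support_labels = [1, -1]
alphas = [0.25, 0.25]

# w = sum_i alpha_i y_i x_i
w = tuple(
    sum(a * y * x[d] for a, y, x in zip(alphas, support_labels, support_vectors))
    for d in range(2)
)

# b = y_i - w . x_i for any support vector (the first one is used here).
b = support_labels[0] - dot(w, support_vectors[0])

def classify(x):
    """sign( sum_i alpha_i y_i (x_i . x) + b ), using only dot products with support vectors."""
    score = sum(
        a * y * dot(sv, x)
        for a, y, sv in zip(alphas, support_labels, support_vectors)
    ) + b
    return 1 if score > 0 else -1
```

Note that `classify` never needs `w` explicitly: the test point enters only through dot products with the support vectors.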

Questions
• What if the data is not linearly separable?

What if the data is not linearly separable?
• Separable: $\min_{w, b} \frac{1}{2} \|w\|^2$ subject to $y_i (w \cdot x_i + b) \ge 1$
• Non-separable: $\min_{w, b} \frac{1}{2} \|w\|^2 + C \sum_{i=1}^{n} \xi_i$ subject to $y_i (w \cdot x_i + b) \ge 1 - \xi_i$, $\xi_i \ge 0$
• $C$: tradeoff constant, $\xi_i$: slack variable (positive)
• Whenever margin is $\ge 1$, $\xi_i = 0$
• Whenever margin is $< 1$, $\xi_i = 1 - y_i (w \cdot x_i + b)$
Lana Lazebnik
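The two slack cases above combine into $\xi_i = \max(0, 1 - y_i (w \cdot x_i + b))$. The sketch below computes the slacks and the non-separable objective for a fixed, hypothetical `w`, `b`, and `C`; it evaluates the objective rather than optimizing it, and the data points are made up.

```python
def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def slack(w, b, x, y):
    """xi = 0 when the margin y (w . x + b) is >= 1, else 1 - y (w . x + b)."""
    return max(0.0, 1.0 - y * (dot(w, x) + b))

def soft_margin_objective(w, b, points, labels, C):
    """0.5 ||w||^2 + C * sum_i xi_i"""
    total_slack = sum(slack(w, b, x, y) for x, y in zip(points, labels))
    return 0.5 * dot(w, w) + C * total_slack

# Hypothetical line and data; the third point is a negative on the wrong side.
w, b = (0.5, 0.5), -2.0
points = [(3.0, 3.0), (1.0, 1.0), (2.5, 2.5), (4.0, 4.0)]
labels = [1, -1, -1, 1]

slacks = [slack(w, b, x, y) for x, y in zip(points, labels)]
objective = soft_margin_objective(w, b, points, labels, C=1.0)
```

A larger `C` penalizes the violating point more heavily, pushing the optimizer toward fitting it at the cost of a smaller margin.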

Today
• HMM examples
• Support vector machines (SVM)
  – Basic algorithm
  – Kernels
    • Structured input spaces: Pyramid match kernels
  – Multi-class
  – HOG + SVM for person detection
    • Visualizing a feature: Hoggles
• Evaluating an object detector

Non-linear SVMs
• Datasets that are linearly separable with some noise work out great:
  [Figure: 1D points along the $x$ axis]
• But what are we going to do if the dataset is just too hard?
  [Figure: 1D points along the $x$ axis]
• How about … mapping data to a higher-dimensional space:
  [Figure: the same points plotted as $x$ versus $x^2$]
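The idea of lifting 1D data to $(x, x^2)$ can be checked directly. In the sketch below the data set is made up (positives on both ends, negatives in the middle), and `separable_by_threshold` is a hypothetical helper that tests whether a single threshold on one coordinate splits the classes.

```python
def separable_by_threshold(values, labels):
    """True if some threshold t and orientation s put exactly the positives on one side."""
    ordered = sorted(set(values))
    thresholds = [(a + b) / 2 for a, b in zip(ordered, ordered[1:])]
    for t in thresholds:
        for s in (1, -1):
            if all((s * (v - t) > 0) == (y == 1) for v, y in zip(values, labels)):
                return True
    return False

# Made-up 1D data set that no single threshold on x can separate.
xs = [-3.0, -2.0, -1.0, 0.0, 1.0, 2.0, 3.0]
ys = [1, 1, -1, -1, -1, 1, 1]

# Lift each point to (x, x^2).
lifted = [(x, x * x) for x in xs]

separable_in_1d = separable_by_threshold(xs, ys)
# In the lifted space a horizontal line (a threshold on x^2) separates the classes.
separable_after_lifting = separable_by_threshold([p[1] for p in lifted], ys)
```

A threshold on $x^2$ is a line in the lifted space but corresponds to two cut points in the original 1D space, which is the nonlinear boundary the slide is hinting at.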

Non-linear SVMs: feature spaces
• General idea: the original input space can be mapped to some higher-dimensional feature space where the training set is separable:
  $\Phi: x \rightarrow \varphi(x)$
Slide from Andrew Moore’s tutorial: http://www.autonlab.org/tutorials/svm.html

Nonlinear SVMs
• The kernel trick: instead of explicitly computing the lifting transformation $\varphi(x)$, define a kernel function $K$ such that
  $K(x_i, x_j) = \varphi(x_i) \cdot \varphi(x_j)$
• This gives a nonlinear decision boundary in the original feature space:
  $\sum_i \alpha_i y_i K(x_i, x) + b$
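A quick numerical check of the kernel trick: for 2D inputs, the quadratic kernel $K(x, z) = (x \cdot z)^2$ equals the dot product of the explicit lifting $\varphi(x) = (x_1^2, \sqrt{2} x_1 x_2, x_2^2)$. The choice of this particular kernel, the function names, and the sample points are assumptions made for illustration; the slide does not single out a kernel here.

```python
import math

def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def phi(x):
    """Explicit lifting for 2D input: (x1^2, sqrt(2) x1 x2, x2^2)."""
    x1, x2 = x
    return (x1 * x1, math.sqrt(2.0) * x1 * x2, x2 * x2)

def quadratic_kernel(x, z):
    """K(x, z) = (x . z)^2, computed without ever forming phi."""
    return dot(x, z) ** 2

def kernel_decision(x, support_vectors, labels, alphas, b, kernel):
    """sign( sum_i alpha_i y_i K(x_i, x) + b )"""
    score = sum(
        a * y * kernel(sv, x)
        for a, y, sv in zip(alphas, labels, support_vectors)
    ) + b
    return 1 if score > 0 else -1

x, z = (1.0, 2.0), (3.0, 4.0)
explicit = dot(phi(x), phi(z))     # dot product in the lifted space
implicit = quadratic_kernel(x, z)  # same quantity via the kernel
kernel_matches_lifting = math.isclose(explicit, implicit)
```

`kernel_decision` has the same shape as the linear classification function, with every dot product $x_i \cdot x$ replaced by $K(x_i, x)$.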

