Features for Computer Vision
Alex Berg
Computer Science Department Columbia University
Why Vision? Light!
It is how we see other people, navigate our environment, communicate ideas, entertain, and measure the world around us.
Microscopy, Surveillance, 3D Analysis / Navigation, Remote Sensing
We need to know which bits to measure!
Deciding which bits to measure…
[Pairs of images compared side by side: some pairs show a small change in pixel values, others a large change.]
Objects in the world → Illumination → Lens → Sensor → Post Processing → Pixels → Higher level Features ("SIFT", "HOG", etc.) → Vision Algorithms → Recognition / Decision
(if possible)
Retroreflective balls; illumination from near the cameras
brighter vs darker: multiple nearby pixels in a circle agreeing probably suffice.
Cows come in many brightnesses, as does the background.
Similar shapes have very different pixel values
Different images of the same thing often look different. Sometimes images of different things look the same. Despite all its useful qualities, light only tells us about objects indirectly…
Pose, Illumination, Articulation, Intra-category variation
Given sufficient training data and a powerful classifier, patches or windows of pixels would be enough; we wouldn't need any high-level features. For a toy 10×10-pixel image with 10 brightness levels there are 10^100 possibilities; you might imagine labeling all of them as face or not… There is almost never enough training data for this approach. Exceptions are when we can enumerate the positive examples.
Pixel value: 0,0,0,0,1,0,0,0,0 → translation → 0,0,0,0,0,0,0,1,0 (and elsewhere 0,0,0,0,0,0,0,0,0)
If not careful, this isn't even differentiable. Need to be careful about smoothing and representation. Of course we can get around simple translation by evaluating comparisons everywhere.
Important to keep track of what you are throwing away
Image → Orientation → Edge Energy
This works because it indicates something physical about the object that is conserved across images.
Illumination fields are often soft, so sharp changes may indicate something about the object, not the illumination. Orientation and phase are often preserved under lighting variations. Possible sources of edges:
‐ Albedo variations on the object surface
‐ Surface structure on the object (changing surface normal, creases, holes)
‐ Boundaries of the object
Work from Simoncelli et al. on the importance of orientation / phase for human perception.
Brightness gradients, Haar wavelets. Multiple scales, elongated filters. Color compass edges: Ruzon & Tomasi. Texture ("compass"): Martin, Fowlkes, Malik.
χ²( · , · )    EMD( · , · )    rᵢ = I ∗ kᵢ
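A minimal sketch (not from the slides) of the χ² distance used to compare such histograms of filter responses; the ε smoothing term is an assumption to avoid division by zero:

```python
import numpy as np

def chi2_distance(h1, h2, eps=1e-10):
    """Chi-squared distance between two histograms (e.g. of filter responses r_i = I * k_i)."""
    h1 = np.asarray(h1, dtype=float)
    h2 = np.asarray(h2, dtype=float)
    # per-bin squared difference, normalized by the total mass in that bin
    return 0.5 * np.sum((h1 - h2) ** 2 / (h1 + h2 + eps))
```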
Color and texture do not match. Too low resolution to extract edges. Coarse optical flow (spatio-temporal gradient direction) features match. Efros, Berg, Mori, Malik ICCV 2003
I(x, y) → F(I(x, y))
Image → feature. Oriented edge detection may be helpful.
Transformed signals: want F(I(T(x, y))) = F(I(x, y))
Region of interest operators. "Histograms"
Slide from Lazebnik
Compute features in light blue region
Adapt region to image content (boxes). Transform to canonical pose. Compute features. Schmid & Mohr, Lowe
Look for local maxima, blobs. Cross section looks like →
SIFT (Lowe '04)
Extract affine regions → Normalize regions → Eliminate rotational ambiguity → Edge Orientation Histograms
Harris-Affine Region of Interest Operator; Lowe's Descriptor
Features!
Use descriptors to compare features and enforce geometric constraints
Eliminate rotational ambiguity. Edge Orientation Histograms
Remaining variation here needs to be handled here
Note that they still don't look exactly the same even on easy images! Lowe's orientation histogram helps, but Grauman & Darrell and Lazebnik et al. have a neat alternative.
“Match” score for sets X, Y.
Idea from statistics: Mallows 1972 included the method of quantizing feature space, which was rediscovered by Rubner et al. 1998 as the Earth Mover's Distance (EMD).
Indyk and Thaper 2003 showed how to embed points in a multiscale pyramid so that the l2 norm on the embedding approximates EMD.
Grauman replaced l2 with histogram intersection. The Histogram Intersection / Min Kernel is positive definite, so we can use it for a kernelized SVM.
Only use the pyramid for the spatial coordinates of features.
Applied to a large region or the whole image; no interest point operator.
Airplanes on the runway are level.
Distribution of edge features: x, y, orientation, energy.
E(x, y, o): edge energy at x, y in orientation o. Histograms are just sums of different slices of E (just a linear projection if E is represented discretely). Same for GIST, Shape Contexts, Geometric Blur, HOG, etc. The only impediment to an understanding of all of these features as simple projections of something like E() above is the min kernel…
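As a sketch, summing slices of a discretized E(x, y, o) gives cell-wise orientation histograms; the cell size here is an arbitrary assumption:

```python
import numpy as np

def orientation_histograms(E, cell=8):
    """Sum edge energy E(x, y, o) over spatial cells: each cell's histogram
    over orientations is just a sum over a slice of E (a linear projection)."""
    H, W, O = E.shape
    ny, nx = H // cell, W // cell
    E = E[:ny * cell, :nx * cell]           # drop any ragged border
    return E.reshape(ny, cell, nx, cell, O).sum(axis=(1, 3))  # (ny, nx, O)
```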
Image → Edges / filter responses → Contrast Normalization → Projection → Comparison (L2, inner product, Min Kernel)
Subhransu Maji (UC Berkeley)
Alex Berg (Columbia University)
Will be a talk at ICCV 2009 in Kyoto
Find pedestrians
10^4 to 10^6 or more windows per image
Boosting + Decision Trees: Viola & Jones (faces). Linear Classifier: Dalal & Triggs (pedestrians). Neural Networks: Rowley et al. (faces)
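A rough sketch of where the window count comes from; the window size, stride, and scale step are illustrative assumptions:

```python
def count_windows(height, width, win=64, stride=8, scale_step=0.8, min_scale=0.2):
    """Count sliding windows over an image pyramid (all numbers illustrative)."""
    total, s = 0, 1.0
    while s >= min_scale:
        h, w = int(height * s), int(width * s)
        if h >= win and w >= win:
            # windows at this scale, stepping by `stride` pixels
            total += ((h - win) // stride + 1) * ((w - win) // stride + 1)
        s *= scale_step
    return total
```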
What is this?
Choose from many categories.
~10^5 example images (training).
Nearest Neighbor: Berg (Caltech 101). Kernelized SVM: Grauman et al. (Caltech 101). Combination of SVMs: Varma et al. (Caltech 101). (skipping model-based methods)
3 sec / comparison vs 0.001 sec / comparison. Slow?
Caltech 101 – Fei‐Fei Li, Pietro Perona 2004
Detection / Classification
Linear Classifier: h(x) = Σ_{i=1}^{#dimensions} w_i x_i — O(#dims)
Kernelized SVM Classifier: h(x) = Σ_{j=1}^{#sv} α_j K(x, x_j) + b — O(#dims × #sv)
Decision function is sign(h).
x: test feature vector; x_j: support vector (training example); x_i: one coordinate of the feature vector; K: kernel function (comparison).
Maji, Berg, Malik CVPR 2008
If K(a, b) = Σ_{i=1}^{#dimensions} K_i(a_i, b_i)
then h(x) = Σ_{j=1}^{#sv} α_j K(x, x_j) + b = Σ_{j=1}^{#sv} α_j Σ_{i=1}^{#dimensions} K_i(x_i, x_j^i) + b = Σ_{i=1}^{#dimensions} h_i(x_i) + b
If you have an additive kernel, then the SVM decision function is additive.
Evaluate these 1D functions efficiently using a look-up table or spline (exact or approximate).
Maji, Berg, Malik CVPR 2008
The Intersection or Min Kernel: K_min(a, b) = Σ_{i=1}^{#dimensions} min(a_i, b_i)
Grauman et al. use this on multiscale histograms to approximate the linear assignment problem (and do recognition). Lazebnik et al. refine this approach to only use multiple scales for position, and not for the features. Much follow-on work.
h(x) = Σ_{j=1}^{#sv} α_j K_min(x, x_j) + b = Σ_{j=1}^{#sv} α_j Σ_{i=1}^{#dimensions} min(x_i, x_j^i) + b = Σ_{i=1}^{#dimensions} h_i(x_i) + b
where h_i(x_i) = Σ_{j=1}^{#sv} α_j min(x_i, x_j^i)
The support vectors are constants; min(x_i, constant) is piecewise linear, so h_i(x_i) is piecewise linear.
O(#dims × #sv) becomes O(#dims × log(#sv)) (exact) or O(#dims) (approximate).
Times in seconds to classify 10,000 test vectors.
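A minimal sketch of the piecewise-linear evaluation idea, assuming non-negative histogram features; `np.interp` does the binary search plus interpolation per dimension:

```python
import numpy as np

def decision_slow(x, alpha, sv, b):
    # O(#sv * #dims): direct min-kernel expansion h(x) = sum_j a_j * Kmin(x, x_j) + b
    return sum(a * np.minimum(x, s).sum() for a, s in zip(alpha, sv)) + b

def build_tables(alpha, sv):
    """h_i(t) = sum_j a_j * min(t, sv[j, i]) is piecewise linear with
    breakpoints at the support-vector values: tabulate it per dimension."""
    breaks, vals = [], []
    for i in range(sv.shape[1]):
        xs = np.sort(np.concatenate(([0.0], sv[:, i])))
        breaks.append(xs)
        vals.append(np.array([(alpha * np.minimum(t, sv[:, i])).sum() for t in xs]))
    return breaks, vals

def decision_fast(x, breaks, vals, b):
    # O(#dims * log #sv): binary search + linear interpolation, exact for x >= 0
    return sum(np.interp(x[i], breaks[i], vals[i]) for i in range(len(x))) + b
```

Beyond the largest support-vector value, h_i is constant, which `np.interp`'s endpoint clamping reproduces exactly.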
(Very Similar to SpaHal Pyramids)
Based on histograms of response to eight oriented edge detections. Non-normalization allows efficient computation.
Caltech 101 with "simple features", 15 training examples per category. Accuracy of Min Kernel vs Linear on text classification: Linear SVM 40% correct; Min Kernel (IK) SVM 52% correct.
It is possible to directly train classifiers with the same structure as the approximation without using support vectors at all. The formulation is very similar to a linear classifier, with different regularization, and can be trained efficiently using stochastic (sub)gradient descent.
Linear: minimize w′w + c Σ_j ξ_j subject to y_j(w′x_j + b) ≥ 1 − ξ_j, ξ_j ≥ 0
Piecewise linear: minimize ŵ′Hŵ + c Σ_j ξ_j subject to y_j(ŵ′x̂_j + b) ≥ 1 − ξ_j, ξ_j ≥ 0
H is tridiagonal, with 2 on the diagonal (1 at the corners) and −1 off the diagonal: a smoothness regularizer on ŵ.
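A sketch of one way to get the piecewise-linear classifier structure directly: encode each coordinate with hat-function coordinates, so a linear classifier on the encoding is piecewise linear in the raw value. The knot count is an assumption:

```python
import numpy as np

def pwl_encode(value, n_knots=5):
    """Encode a scalar in [0, 1] with hat-function (piecewise-linear) coordinates;
    a linear classifier on this encoding is piecewise linear in the raw value."""
    z = np.zeros(n_knots)
    t = np.clip(value, 0.0, 1.0) * (n_knots - 1)
    lo = min(int(t), n_knots - 2)   # left knot index
    frac = t - lo                   # position between the two active knots
    z[lo] = 1.0 - frac
    z[lo + 1] = frac
    return z
```

The encoding is sparse (at most two non-zeros per coordinate), which is what makes the sparse stochastic-gradient training cheap.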
[Plots of the learned weights w: linear vs piecewise linear.]
Shalev-Shwartz, Singer, Srebro ICML 2007 (Pegasos): stochastic subgradient training reaches accuracy ε in O(d / (λε)) time.
Maji, Berg, ICCV 2009: with the regularizer ½ w′Hw, the Pegasos shrinkage step on the intermediate iterate w_{t+1/2} uses (1 − η_t λ H) in place of (1 − η_t λ).
w and x are large but sparse, so we can get computation to scale with the number of non-zeros.
coordinate separately
window detection: slower than Viola-Jones, but may work for a broader range of categories
the correct additional projections can approach an arbitrary classifier
Attribute and Simile Classifiers for Face Verification
Neeraj Kumar, Alex Berg, Peter Belhumeur, Shree Nayar Columbia University
Will be a talk at ICCV 2009 in Kyoto
Faces in the wild, as in Names and Faces, T. Berg et al. CVPR 2004.
Large variation in pose, illumination, expression, lighting, etc.
Typical measures of face similarity (e.g. PCA+LDA) would say these faces are very different.
Ferencz et al. learned (discriminative) generalized linear models; some other groups (Caltech, UMass) have looked at similar approaches.
Erik Learned-Miller and the UMass group labeled a broader version of the Names and Faces data, called Labeled Faces in the Wild (LFW). Results from many groups are available; best are from Wolf et al. and Nowak and Jurie.
We look at attributes of the faces that may be robust with respect to identity, but are also common to many people, so that we can collect very large training sets for each attribute.
RGB, HSV, Gradient, Gradient Direction. Moments, Histograms. No normalization, or l1, or "l2". Various subparts of faces (requires alignment to a single generic face).
Some Training Data Collected Using Amazon’s Mechanical Turk
~2000… An individual person may have distinctive features (e.g. eyes), so learn a classifier to recognize their eyes (e.g. "she had Bette Davis Eyes").
Humans are really good! They don't even need to see the face! Other algorithms have access to all this background, and we do substantially better looking only at the face… But there is still a long way to go to human performance.
This portion was used to train the similarity classifiers… In total, over 200 people with 100+ images of each.
It is difficult to collect many example images of particular people, but easy to collect millions of examples of Asians, or smiling people, etc., with huge variation including pose and settings. The similarity of a face to any of these groups defines a simplified coordinate for its position in the space of all faces.
Model of Car Image
Appearance: cost of matching two local features ( geometric blur ) Binary indicator vector: Xij = 1 iff i matches to j Geometry: cost of matching two pairs of features Integer Quadratic Programming
A.C. Berg, T.L. Berg, J. Malik CVPR 05
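A tiny sketch of the quadratic matching objective above; the array names and shapes are assumptions for illustration:

```python
import numpy as np

def matching_cost(A, G, X):
    """Cost of a candidate assignment X for quadratic feature matching.
    A[i, j]: appearance cost of matching model feature i to image feature j.
    G[i, j, k, l]: geometric cost of matching pair (i, k) to pair (j, l).
    X[i, j] = 1 iff model feature i matches image feature j (binary indicator)."""
    x = X.astype(float).ravel()
    n = x.size
    # linear appearance term + quadratic geometry term over all match pairs
    return A.ravel() @ x + x @ G.reshape(n, n) @ x
```

Minimizing this over binary X (with one-to-one constraints) is the integer quadratic program; the slides' follow-up relaxes it.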
Keep quadratic framework, but use parallel edges as local features
ICCV 05
Edges → Parallel Edges → Configuration. The only completely non-geometric-blur-related work in the talk, for now…
Align faces before representation
Cluster using face appearance and names from captions, get a very large data set of faces automatically 90% (or more) accurate
T.L Berg, A.C. Berg,
NIPS 04
Query image vs database of templates: compare the query against each template, best one wins. Best matching template is a helicopter.
Templates give correspondence. Bags of features just classify images; templates give correspondence: “real object recognition”. Current best: knn-svm.
Deconstruction...
geometric blur descriptors A.C. Berg J. Malik CVPR 2001
To classify images, forget about the objects, just match features in the whole image.
No Geometry Berg Thesis 2005
No Geometry Frome Singer Malik NIPS 2006
Rough position in image Zhang, Berg, Maire, Malik CVPR 2006
Spatial pyramid matching kernel Lazebnik et al 59%
Model of Car Image
Appearance: cost of matching two local features ( geometric blur ) Binary indicator vector: Xij = 1 iff i matches to j Geometry: cost of matching two pairs of features Easy Linear Programming
A.C. Berg, T.L. Berg, J. Malik CVPR 05
No Geometry 54-60%
model appearance similarity
right part of the image
neighbors to train a svm / query
best results on image classification
query k nearest neighbors local decision boundary
CVPR 2006
Basically, appending position to the feature vector + an SVM gives 60%; + knn-svm gives another 2%.
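A minimal sketch of appending (normalized) position to a descriptor before training an SVM; the normalization and the optional weight are assumptions:

```python
import numpy as np

def append_position(descriptor, x, y, width, height, weight=1.0):
    """Concatenate normalized image position onto a feature descriptor,
    so a linear-in-features classifier can also use rough location."""
    pos = [weight * x / width, weight * y / height]
    return np.concatenate([np.asarray(descriptor, dtype=float), pos])
```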
Since we are recognizing the whole image anyway
Classification rate ~90% correct. Same range as humans! (Thorpe et al.) but…
Beyond Caltech 101 – TRECVID
weather, sports, person, car, business leader, etc.
Slav Petrov, Arlo Faria, Pascal Michaillat, Alexander Berg, Andreas Stolcke, Dan Klein, Jitendra Malik
tired of looking at the same 9144 images?
mAP = 0.11
Results ’05 Berkeley-Shape mAP = 0.38 Best ’05 (IBM) mAP = 0.34
Best Berkeley-Shape Median
Extract Sparse Channels (Edges) Apply Spatially Varying Blur Subsample
Sparse Non-Negative Channels (e.g. oriented edge energy)
geometric blur descriptor
Berg, Malik CVPR 2001 Berg, Berg, Malik CVPR 2005
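A rough numpy-only sketch of the three steps above (sparse channel, spatially varying blur, sample); the box blur stands in for a Gaussian, and the radii and distance-to-blur mapping are illustrative assumptions:

```python
import numpy as np

def box_blur(channel, r):
    # crude stand-in for a Gaussian: (2r+1) x (2r+1) box average
    if r == 0:
        return channel.astype(float)
    pad = np.pad(channel, r, mode='edge')
    out = np.zeros(channel.shape, dtype=float)
    k = 2 * r + 1
    for dy in range(k):
        for dx in range(k):
            out += pad[dy:dy + channel.shape[0], dx:dx + channel.shape[1]]
    return out / (k * k)

def geometric_blur(channel, cx, cy, radii=(0, 1, 2)):
    """Blur a sparse channel with blur that increases with distance from (cx, cy)."""
    ys, xs = np.mgrid[:channel.shape[0], :channel.shape[1]]
    dist = np.hypot(ys - cy, xs - cx)
    stack = np.stack([box_blur(channel, r) for r in radii])
    # pick a blur level per pixel: farther from the feature point -> more blur
    level = np.minimum((dist / (dist.max() / len(radii) + 1e-9)).astype(int),
                       len(radii) - 1)
    return stack[level, ys, xs]
```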
Features on object search: Torralba, Oliva, Castelhano & Henderson, Psychological Review 2006. Object dependent saliency maps.
Sudderth, Torralba, Freeman & Willsky NIPS, ICCV 2005 and others. Local features, simple rigorous models for their arrangement.
Hoiem, Efros, Hebert IJCV 2006, and others
"When you can measure what you are speaking about and express it in numbers, you know something about it; but when you cannot measure it, when you cannot express it in numbers, your knowledge is of the meager and unsatisfactory kind."
– Lord Kelvin
Mutual Information Relative Mutual Information Entropy
Sky, Foliage, Building, Street
perform regression
features and classifiers tell about the labels
I stand at the window and see a house, trees, sky. Theoretically I might say there were 327 brightnesses and nuances of colour. Do I have “327”?
– Max Wertheimer 1923
Sky, Tree, Building, Street
Use this coarse parsing for more detailed parsing of buildings: Building Boundary, Roofline, Window, Roof, Building Color, Roof Color
Trying to use a great deal of this type of data; anything automatic is helpful.
to visual recognition...
Work with: Floraine Grabler ETH Zurich, Jitendra Malik U.C. Berkeley
KNN density estimates prob. of label (sky, tree, etc.) given feature(s), except two SVMs for central pixel color and edge energy: one for sky, one for trees.
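A minimal sketch of the KNN density estimate for p(label | feature); k and the feature layout are assumptions:

```python
import numpy as np

def knn_label_probs(train_feats, train_labels, query, k=3):
    """Estimate p(label | feature) from the k nearest labeled training features."""
    dists = np.linalg.norm(train_feats - query, axis=1)
    nn = np.argsort(dists)[:k]                      # indices of k nearest neighbors
    counts = np.bincount(train_labels[nn], minlength=train_labels.max() + 1)
    return counts / k                               # label frequencies among neighbors
```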
KNN density estimates prob. of label (sky, tree, etc.) given feature → most likely category.
Tree/Foliage, Building, Sky, Mixed Sky, Street
Sky (red) or not (blue) SVM for central pixel color and edge energy, to predict sky
Foliage (red) or not (blue) SVM for central pixel color and edge energy, to predict foliage
Combination
Still some confusion; after all, the building is street colored. So use a first pass of a detailed parse to make a building and sky model for this image…
Image specific model
With some spatial smoothing (driven by training data)
This rough parsing helps find
Using windows as an example: evaluate hypotheses about window location and size.
Various ways to form window hypotheses. Combine with model of building color and spatial context.
Without configuration cue With configuration cue
Mutual Information, Relative Mutual Information, Entropy
Simple patch features without segmentation tell us about as much about the coarse-scale parsing as the output of the geometric context work; more for buildings and foliage.
Simple patch features (color histogram here) also give a fair amount of information about the geometric structure.
geometric blur 0.28 0.26 0.36
“Still” important: we need to harness this.