Bag-of-features models for category classification
Cordelia Schmid

Category recognition: tasks
• Image classification: assigning a class label to the image
  Car: present; Cow: present; Bike: not present; Horse: not present; …
• Object localization: define the location and the category
  [Figure: example image with object locations labelled by category (Car, Cow)]

Difficulties: within-object variations
Variability: camera position, illumination, internal parameters

Difficulties: within-class variations

Image classification
• Given
– Positive training images containing an object class
– Negative training images that don’t
• Classify
– A test image as to whether it contains the object class or not

Bag-of-features – Origin: texture recognition
• Texture is characterized by the repetition of basic elements or textons
Julesz, 1981; Cula & Dana, 2001; Leung & Malik, 2001; Mori, Belongie & Malik, 2001; Schmid, 2001; Varma & Zisserman, 2002, 2003; Lazebnik, Schmid & Ponce, 2003

Bag-of-features – Origin: texture recognition
[Figure: textures represented as histograms over a universal texton dictionary]
Julesz, 1981; Cula & Dana, 2001; Leung & Malik, 2001; Mori, Belongie & Malik, 2001; Schmid, 2001; Varma & Zisserman, 2002, 2003; Lazebnik, Schmid & Ponce, 2003

Bag-of-features – Origin: bag-of-words (text)
• Orderless document representation: frequencies of words from a dictionary
• Classification to determine document categories
Bag-of-words (rows: dictionary words; columns: documents)
Common      2  0  1  3
People      3  0  0  2
Sculpture   0  1  3  0
…           …  …  …  …
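The orderless representation can be illustrated with a minimal sketch. The dictionary and the example sentence below are made up for illustration and are not taken from the slides; the function name `bag_of_words` is likewise hypothetical.

```python
from collections import Counter

def bag_of_words(text, dictionary):
    """Count how often each dictionary word occurs in text, ignoring word order."""
    counts = Counter(text.lower().split())
    return [counts[word] for word in dictionary]

# Illustrative dictionary and document (assumptions, not from the slides).
dictionary = ["common", "people", "sculpture"]
doc = "People admire the sculpture while other people walk past the sculpture"
print(bag_of_words(doc, dictionary))
```

Shuffling the words of the document leaves the vector unchanged, which is exactly what "orderless" means here.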

Bag-of-features for image classification
• Step 1: extract regions, compute descriptors
• Step 2: find clusters and frequencies
• Step 3: compute distance matrix, classification (SVM)
[Csurka et al., ECCV Workshop’04], [Nowak, Jurie & Triggs, ECCV’06], [Zhang, Marszalek, Lazebnik & Schmid, IJCV’07]

Step 1: feature extraction
• Scale-invariant image regions + SIFT (see previous lecture)
– Affine-invariant regions give “too” much invariance
– Rotation invariance is, for many realistic collections, “too” much invariance
• Dense descriptors
– Improve results in the context of categories (for most categories)
– Interest points do not necessarily capture “all” features
• Color-based descriptors
• Shape-based descriptors

Dense features
- Multi-scale dense grid: extraction of small overlapping patches at multiple scales
- Computation of the SIFT descriptor for each grid cell
- Exp.: horizontal/vertical step size of 3 pixels, scaling factor of 1.2 per level
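A multi-scale dense grid of this kind can be sketched as follows. Only the step size (3 pixels) and the scaling factor (1.2 per level) come from the slide; the base patch size, the number of levels, and the choice to grow the patch rather than shrink the image are assumptions made for illustration.

```python
def dense_grid(width, height, base_size=16, step=3, scale=1.2, levels=3):
    """Return (x, y, size) for overlapping square patches on a multi-scale grid.

    base_size and levels are illustrative assumptions; step and scale follow
    the experimental setting quoted on the slide.
    """
    patches = []
    for level in range(levels):
        size = round(base_size * scale ** level)  # patch side at this level
        for y in range(0, height - size + 1, step):
            for x in range(0, width - size + 1, step):
                patches.append((x, y, size))
    return patches

patches = dense_grid(64, 48)
print(len(patches), patches[:3])
```

A descriptor such as SIFT would then be computed on each returned patch; that step is not shown here.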

Step 2: Quantization
[Figure: clustering of the descriptors yields the visual vocabulary]

Examples for visual words
[Figure: example visual words for Airplanes, Motorbikes, Faces, Wild Cats, Leaves, People, Bikes]

Step 2: Quantization
• Cluster descriptors
– K-means
– Gaussian mixture model
• Assign each descriptor to a cluster (visual word)
– Hard or soft assignment
• Build frequency histogram

K-means clustering
• Minimizing the sum of squared Euclidean distances between points $x_i$ and their nearest cluster centers
• Algorithm:
– Randomly initialize K cluster centers
– Iterate until convergence:
  • Assign each data point to the nearest center
  • Recompute each cluster center as the mean of all points assigned to it
• Local minimum, solution dependent on initialization
• Initialization important: run several times, select best
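The algorithm can be sketched in plain Python as below. This is a minimal illustration rather than a production implementation: the function names, the toy 2-D points, and the choice of five restarts are assumptions, and real descriptors such as SIFT would be 128-dimensional.

```python
import random

def sq_dist(a, b):
    """Squared Euclidean distance between two points given as tuples."""
    return sum((ai - bi) ** 2 for ai, bi in zip(a, b))

def mean(points):
    """Component-wise mean of a non-empty list of points."""
    n = len(points)
    return tuple(sum(coords) / n for coords in zip(*points))

def kmeans(points, k, max_iters=100, seed=0):
    """One k-means run: returns (centers, sum of squared distances)."""
    rng = random.Random(seed)
    centers = rng.sample(points, k)  # random initialization from the data
    for _ in range(max_iters):
        # Assignment step: each point goes to its nearest center.
        clusters = [[] for _ in range(k)]
        for p in points:
            nearest = min(range(k), key=lambda j: sq_dist(p, centers[j]))
            clusters[nearest].append(p)
        # Update step: each center becomes the mean of its points
        # (an empty cluster keeps its previous center).
        new_centers = [mean(c) if c else centers[j] for j, c in enumerate(clusters)]
        if new_centers == centers:  # converged
            break
        centers = new_centers
    cost = sum(min(sq_dist(p, c) for c in centers) for p in points)
    return centers, cost

def kmeans_restarts(points, k, restarts=5):
    """Run k-means from several initializations and keep the lowest-cost result."""
    return min((kmeans(points, k, seed=s) for s in range(restarts)), key=lambda r: r[1])

points = [(0, 0), (0, 1), (1, 0), (10, 10), (10, 11), (11, 10)]
centers, cost = kmeans_restarts(points, k=2)
print(sorted(centers), cost)
```

Keeping the run with the lowest cost is one simple way to act on the last bullet, since each run only reaches a local minimum.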

Gaussian mixture model (GMM)
• Mixture of Gaussians: weighted sum of Gaussians
[Equation not recoverable from the slide text]
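For reference, the standard textbook form of a mixture of Gaussians is given below. The slide's own notation did not survive, so the symbols $\pi_k$ (mixture weights), $\mu_k$ (means), $\Sigma_k$ (covariances), $K$ (number of components) and $d$ (descriptor dimension) are assumptions.

```latex
p(x) = \sum_{k=1}^{K} \pi_k \, \mathcal{N}(x \mid \mu_k, \Sigma_k),
\qquad \pi_k \ge 0, \quad \sum_{k=1}^{K} \pi_k = 1,

\text{where} \quad
\mathcal{N}(x \mid \mu_k, \Sigma_k) =
\frac{1}{(2\pi)^{d/2} \, |\Sigma_k|^{1/2}}
\exp\!\left( -\tfrac{1}{2} (x - \mu_k)^{\top} \Sigma_k^{-1} (x - \mu_k) \right).
```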

Hard or soft assignment
• K-means → hard assignment
– Assign to the closest cluster center
– Count number of descriptors assigned to a center
• Gaussian mixture model → soft assignment
– Estimate distance to all centers
– Sum over number of descriptors
• Represent image by a frequency histogram
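The difference between the two schemes can be sketched as follows. The soft variant here is a deliberate simplification of a full Gaussian mixture: it assumes equal mixture weights and one shared isotropic variance `sigma`, so it should be read as an illustration of the idea, not as the model used in the lecture.

```python
import math

def sq_dist(a, b):
    """Squared Euclidean distance between two points given as tuples."""
    return sum((ai - bi) ** 2 for ai, bi in zip(a, b))

def hard_histogram(descriptors, centers):
    """Each descriptor adds one count to its closest center."""
    hist = [0] * len(centers)
    for d in descriptors:
        nearest = min(range(len(centers)), key=lambda k: sq_dist(d, centers[k]))
        hist[nearest] += 1
    return hist

def soft_histogram(descriptors, centers, sigma=1.0):
    """Each descriptor spreads one unit of mass over all centers.

    Assumes equal weights and a shared isotropic variance (a simplification).
    """
    hist = [0.0] * len(centers)
    for d in descriptors:
        dists = [sq_dist(d, c) for c in centers]
        d_min = min(dists)  # subtracted for numerical stability
        weights = [math.exp(-(dk - d_min) / (2 * sigma ** 2)) for dk in dists]
        total = sum(weights)
        for k, w in enumerate(weights):
            hist[k] += w / total
    return hist

centers = [(0, 0), (10, 0)]
descriptors = [(1, 0), (9, 0), (5, 0)]
print(hard_histogram(descriptors, centers))
print(soft_histogram(descriptors, centers))
```

The descriptor at (5, 0) is equidistant from both centers: hard assignment must pick one of them, while soft assignment splits its contribution evenly.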

Image representation
[Figure: histogram of codeword occurrences; axes: codewords, frequency]
• Each image is represented by a vector, typically of 1000-4000 dimensions, normalized with the L1 or L2 norm
• Fine grained: represent model instances
• Coarse grained: represent object categories
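The two normalizations mentioned here can be sketched as below. The function names are hypothetical, and returning an all-zero histogram unchanged is a design choice made for this sketch.

```python
import math

def l1_normalize(hist):
    """Scale the histogram so that its absolute values sum to 1."""
    total = sum(abs(v) for v in hist)
    return [v / total for v in hist] if total else list(hist)

def l2_normalize(hist):
    """Scale the histogram so that its Euclidean length is 1."""
    norm = math.sqrt(sum(v * v for v in hist))
    return [v / norm for v in hist] if norm else list(hist)

print(l1_normalize([2, 1, 1]))
print(l2_normalize([3, 4]))
```

Either normalization makes histograms from images with different numbers of descriptors comparable.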

Step 3: Classification
• Learn a decision rule (classifier) assigning bag-of-features representations of images to different classes
[Figure: decision boundary separating Zebra from Non-zebra examples]

Training data
Vectors are histograms, one from each training image
[Figure: positive and negative training examples]
Train classifier, e.g. SVM

Linear classifiers
• Find a linear function (hyperplane) to separate positive and negative examples
  $x_i$ positive: $x_i \cdot w + b \ge 0$
  $x_i$ negative: $x_i \cdot w + b < 0$
Which hyperplane is best?
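The decision rule of a linear classifier can be written in a few lines. The weight vector and bias below are arbitrary illustrative values, not learned parameters.

```python
def classify(x, w, b):
    """Return +1 if x lies on the positive side of the hyperplane w.x + b = 0, else -1."""
    score = sum(xi * wi for xi, wi in zip(x, w)) + b
    return 1 if score >= 0 else -1

w, b = (1.0, 1.0), -6.0  # illustrative hyperplane x1 + x2 = 6
print(classify((4, 4), w, b), classify((2, 2), w, b))
```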

Linear classifiers - margin
• Generalization is not good in this case:
  [Figure: separating line passing close to the training points; axes $x_1$ (roundness), $x_2$ (color)]
• Better if a margin is introduced:
  [Figure: separating line with a margin, annotated $b/|w|$; axes $x_1$ (roundness), $x_2$ (color)]
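One way to compare two separating hyperplanes is to compute the smallest signed distance from any training point to the hyperplane. The sketch below does this on made-up 2-D points; the data, the labels, and both hyperplanes are assumptions chosen only to show that a centered hyperplane has the larger margin.

```python
import math

def margin(points, labels, w, b):
    """Smallest signed distance y * (w.x + b) / |w| over all training points."""
    norm = math.sqrt(sum(wi * wi for wi in w))
    return min(
        y * (sum(xi * wi for xi, wi in zip(x, w)) + b) / norm
        for x, y in zip(points, labels)
    )

points = [(1, 1), (2, 2), (4, 4), (5, 5)]
labels = [-1, -1, 1, 1]
centered = margin(points, labels, (1, 1), -6.0)   # line x1 + x2 = 6
off_center = margin(points, labels, (1, 1), -4.5) # line x1 + x2 = 4.5
print(centered, off_center)
```

Both hyperplanes classify the training points correctly, but the centered one leaves more room on each side, which is the property a maximum-margin classifier such as an SVM optimizes.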

Nonlinear SVMs
• Datasets that are linearly separable work out great:
  [Figure: 1-D points on the x axis, separable by a threshold]
• But what if the dataset is just too hard?
  [Figure: 1-D points on the x axis, not separable by a threshold]
• We can map it to a higher-dimensional space:
  [Figure: the same points plotted against x and $x^2$]

Nonlinear SVMs
• General idea: the original input space can always be mapped to some higher-dimensional feature space where the training set is separable: Φ: x → φ(x)

Nonlinear SVMs
• The kernel trick: instead of explicitly computing the lifting transformation φ(x), define a kernel function K such that
  $K(x_i, x_j) = \varphi(x_i) \cdot \varphi(x_j)$
• This gives a nonlinear decision boundary in the original feature space:
  $\sum_i \alpha_i y_i K(x_i, x) + b$
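The kernel identity can be checked numerically on a small example. The quadratic kernel and its explicit lifting used below are a standard textbook pair chosen for illustration; the slides do not say which kernel was shown, and the support vectors and coefficients in the decision function are made-up values.

```python
import math

def phi(x):
    """Explicit lifting of a 2-D point for the quadratic kernel (x.y)^2."""
    return (x[0] ** 2, math.sqrt(2) * x[0] * x[1], x[1] ** 2)

def kernel(x, y):
    """Quadratic kernel: computes phi(x).phi(y) without forming phi."""
    return (x[0] * y[0] + x[1] * y[1]) ** 2

def decision(x, support, alphas, labels, b):
    """Kernelized decision value: sum_i alpha_i * y_i * K(x_i, x) + b."""
    return sum(a * y * kernel(s, x) for s, a, y in zip(support, alphas, labels)) + b

x, y = (1, 2), (3, 4)
lifted = sum(p * q for p, q in zip(phi(x), phi(y)))
print(kernel(x, y), lifted)

# Illustrative (not learned) support vectors and coefficients.
value = decision((3, 0), [(1, 0), (2, 0)], [1, 1], [-1, 1], 0)
print(value)
```

The kernel evaluates one dot product in the original 2-D space, while the explicit route needs the 3-D lifted vectors; for richer kernels the lifted space can be far larger or infinite-dimensional.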

Kernels for bags of features
• Histogram intersection kernel: $I(h_1, h_2) = \sum_{i=1}^{N} \min(h_1(i), h_2(i))$
• Generalized Gaussian kernel: $K(h_1, h_2) = \exp\left(-\frac{1}{A} D(h_1, h_2)^2\right)$
• D can be Euclidean distance → RBF kernel
• D can be χ² distance: $D(h_1, h_2) = \sum_{i=1}^{N} \frac{(h_1(i) - h_2(i))^2}{h_1(i) + h_2(i)}$
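These kernels are straightforward to compute on histograms. The sketch below assumes the generalized Gaussian form $K = \exp(-D^2 / A)$ with a user-chosen scale `A`; skipping bins where both histograms are zero in the χ² distance is a common convention adopted here to avoid division by zero, not something stated on the slide.

```python
import math

def histogram_intersection(h1, h2):
    """Sum of bin-wise minima of two histograms."""
    return sum(min(a, b) for a, b in zip(h1, h2))

def euclidean_distance(h1, h2):
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(h1, h2)))

def chi2_distance(h1, h2):
    """Chi-square distance; bins that are empty in both histograms are skipped."""
    return sum((a - b) ** 2 / (a + b) for a, b in zip(h1, h2) if a + b > 0)

def generalized_gaussian_kernel(h1, h2, distance, A=1.0):
    """exp(-D(h1, h2)^2 / A) for a pluggable distance D (assumed form)."""
    return math.exp(-distance(h1, h2) ** 2 / A)

h1 = [0.5, 0.5, 0.0]
h2 = [0.25, 0.25, 0.5]
print(histogram_intersection(h1, h2))
print(generalized_gaussian_kernel(h1, h2, euclidean_distance))
print(generalized_gaussian_kernel(h1, h2, chi2_distance))
```

With the Euclidean distance this reduces to the usual RBF kernel; in practice `A` is often set from the training data, for example to the mean distance between training histograms.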
