

  1. Bag-of-features for category classification
     Cordelia Schmid

  2. Category recognition
     • Image classification: assigning a class label to the image
       (Car: present, Cow: present, Bike: not present, Horse: not present, …)

  3. Category recognition – Tasks
     • Image classification: assigning a class label to the image
       (Car: present, Cow: present, Bike: not present, Horse: not present, …)
     • Object localization: define the location and the category
       (e.g. Location + Category: Car, Cow)

  4. Category recognition
     • Robust image description
       – Appropriate descriptors for categories
     • Statistical modeling and machine learning for vision
       – Use and validation of appropriate techniques

  5. Why machine learning?
     • Early approaches: simple features + handcrafted models
     • Can handle only a few images and simple tasks
     L. G. Roberts, Machine Perception of Three-Dimensional Solids,
     Ph.D. thesis, MIT Department of Electrical Engineering, 1963.

  6. Why machine learning?
     • Early approaches: manual programming of rules
     • Tedious, limited, and does not take the data into account
     Y. Ohta, T. Kanade, and T. Sakai, "An Analysis System for Scenes
     Containing Objects with Substructures," International Joint Conference
     on Pattern Recognition, 1978.

  7. Why machine learning?
     • Today: lots of data, complex tasks
       – Internet images, movies, news, sports, personal photo albums
     • Instead of trying to encode rules directly, learn them from examples
       of inputs and desired outputs

  8. Types of learning problems
     • Supervised
       – Classification
       – Regression
     • Unsupervised
     • Semi-supervised
     • Active learning
     • …

  9. Supervised learning
     • Given training examples of inputs and corresponding outputs, produce
       the "correct" outputs for new inputs
     • Two main scenarios:
       – Classification: outputs are discrete variables (category labels).
         Learn a decision boundary that separates one class from the other.
       – Regression: also known as "curve fitting" or "function
         approximation." Learn a continuous input-output mapping from
         (possibly noisy) examples.
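
A minimal sketch of the two scenarios, using scikit-learn (an assumed
dependency) on toy data that is not from the slides:

import numpy as np
from sklearn.svm import SVC
from sklearn.linear_model import LinearRegression

# Classification: discrete outputs; learn a decision boundary.
X = np.array([[0.0, 0.0], [0.2, 0.1], [1.0, 1.0], [0.9, 0.8]])
y = np.array([0, 0, 1, 1])                       # category labels
clf = SVC(kernel="linear").fit(X, y)
print(clf.predict([[0.1, 0.0], [0.95, 0.9]]))    # -> [0 1]

# Regression: continuous outputs; learn an input-output mapping.
x = np.linspace(0.0, 1.0, 20).reshape(-1, 1)
t = 3.0 * x.ravel() + 0.5 + 0.05 * np.random.randn(20)  # noisy line
reg = LinearRegression().fit(x, t)
print(reg.coef_, reg.intercept_)                 # close to 3.0 and 0.5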

  10. Unsupervised learning
      • Given only unlabeled data as input, learn some sort of structure
      • The objective is often more vague or subjective than in supervised
        learning; this is more of an exploratory/descriptive data analysis

  11. Unsupervised learning
      • Clustering
        – Discover groups of "similar" data points
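
A minimal clustering sketch with k-means (scikit-learn assumed; synthetic
2-D points):

import numpy as np
from sklearn.cluster import KMeans

rng = np.random.default_rng(0)
# Two "similar" groups of points around different centers.
points = np.vstack([rng.normal(0.0, 0.3, (50, 2)),
                    rng.normal(2.0, 0.3, (50, 2))])
kmeans = KMeans(n_clusters=2, n_init=10).fit(points)
print(kmeans.cluster_centers_)   # approximately (0, 0) and (2, 2)
print(kmeans.labels_[:5])        # discovered group of the first points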

  12. Unsupervised learning
      • Quantization
        – Map a continuous input to a discrete (more compact) output,
          e.g. a cluster index (see the sketch below)
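
Quantization can reuse clustering: the fitted centers define the discrete
codes, and each continuous input maps to the index of its nearest center.
Again a scikit-learn sketch on synthetic data:

import numpy as np
from sklearn.cluster import KMeans

rng = np.random.default_rng(1)
signal = rng.uniform(0.0, 10.0, (200, 1))         # continuous 1-D inputs
kmeans = KMeans(n_clusters=4, n_init=10).fit(signal)

codes = kmeans.predict(signal[:5])                # one discrete code per input
reconstruction = kmeans.cluster_centers_[codes]   # compact approximation
print(signal[:5].ravel(), codes, reconstruction.ravel())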

  13. Unsupervised learning
      • Dimensionality reduction, manifold learning
        – Discover a lower-dimensional surface on which the data lives
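
A minimal sketch of dimensionality reduction with PCA (scikit-learn
assumed): 3-D points that actually live near a 2-D plane are projected back
to two coordinates.

import numpy as np
from sklearn.decomposition import PCA

rng = np.random.default_rng(0)
plane = rng.normal(size=(100, 2))                            # 2-D latent coordinates
X = plane @ np.array([[1.0, 0.0, 0.5], [0.0, 1.0, -0.5]])    # embed in 3-D
X += 0.01 * rng.normal(size=X.shape)                         # small noise off the plane

pca = PCA(n_components=2).fit(X)
print(pca.explained_variance_ratio_)  # nearly all variance in 2 components
Z = pca.transform(X)                  # recovered low-dimensional coordinates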

  14. Other types of learning
      • Semi-supervised learning: lots of data is available, but only a
        small portion is labeled (e.g. since labeling is expensive)

  15. Other types of learning
      • Semi-supervised learning: lots of data is available, but only a
        small portion is labeled (e.g. since labeling is expensive)
        – Why is learning from labeled and unlabeled data better than
          learning from labeled data alone?

  16. Other types of learning
      • Active learning: the learning algorithm can choose its own training
        examples, or ask a "teacher" for an answer on selected inputs

  17. Category recognition
      • Image classification: assigning a class label to the image
        (Car: present, Cow: present, Bike: not present, Horse: not present, …)
      • Supervised scenario: given a set of training images

  18. Image classification
      • Given
        – Positive training images containing an object class
        – Negative training images that don't
      • Classify
        – A test image as to whether it contains the object class or not

  19. Bag-of-features for image classification
      • Origin: texture recognition
      • Texture is characterized by the repetition of basic elements, or
        textons
      Julesz, 1981; Cula & Dana, 2001; Leung & Malik, 2001; Mori, Belongie &
      Malik, 2001; Schmid, 2001; Varma & Zisserman, 2002, 2003; Lazebnik,
      Schmid & Ponce, 2003

  20. Texture recognition
      • Represent a texture as a histogram over a universal texton dictionary
      Julesz, 1981; Cula & Dana, 2001; Leung & Malik, 2001; Mori, Belongie &
      Malik, 2001; Schmid, 2001; Varma & Zisserman, 2002, 2003; Lazebnik,
      Schmid & Ponce, 2003

  21. Bag-of-features – Origin: bag-of-words (text)
      • Orderless document representation: frequencies of words from a
        dictionary
      • Classification to determine document categories
      Bag-of-words counts (word frequency per document):
        Common      2   0   0   1
        People      3   0   0   2
        Sculpture   0   1   3   0
        …           …   …   …   …
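
A minimal sketch of the bag-of-words representation for text: count how
often each dictionary word occurs, ignoring word order. The dictionary and
documents below are illustrative, not taken from the slides.

from collections import Counter

dictionary = ["common", "people", "sculpture"]
documents = [
    "people watch people near the common sculpture",
    "a common garden and a common path",
]

for doc in documents:
    counts = Counter(doc.lower().split())
    histogram = [counts[w] for w in dictionary]
    print(histogram)   # orderless representation: word frequencies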

  22. Bag-of-features for image classification
      Extract regions → Compute descriptors → Find clusters and frequencies
      → Compute distance matrix → Classification (SVM)
      [Csurka et al. WS'2004], [Nowak et al. ECCV'06], [Zhang et al. IJCV'07]

  23. Bag-of-features for image classification
      Step 1: Extract regions, compute descriptors → Step 2: Find clusters
      and frequencies → Step 3: Compute distance matrix → Classification (SVM)
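
A compact sketch of the whole pipeline, assuming OpenCV with SIFT
(cv2.SIFT_create, OpenCV >= 4.4) and scikit-learn. An RBF-kernel SVM stands
in here for the explicit distance matrix plus kernel (e.g. chi-square) of
the cited papers, and train_images / train_labels are placeholders for your
own grayscale images and class labels.

import cv2
import numpy as np
from sklearn.cluster import MiniBatchKMeans
from sklearn.svm import SVC

sift = cv2.SIFT_create()

def image_descriptors(gray):
    # Step 1: extract regions and compute SIFT descriptors.
    _, desc = sift.detectAndCompute(gray, None)
    return desc if desc is not None else np.zeros((0, 128), np.float32)

def bof_histogram(desc, vocabulary, k):
    # Step 2: assign each descriptor to a cluster and count frequencies.
    hist = np.zeros(k)
    if len(desc) > 0:
        words = vocabulary.predict(desc)
        hist = np.bincount(words, minlength=k).astype(float)
        hist /= hist.sum()
    return hist

def train_bof_classifier(train_images, train_labels, k=1000):
    all_desc = np.vstack([image_descriptors(im) for im in train_images])
    vocabulary = MiniBatchKMeans(n_clusters=k).fit(all_desc)
    X = np.array([bof_histogram(image_descriptors(im), vocabulary, k)
                  for im in train_images])
    # Step 3 + SVM: the RBF kernel replaces the explicit distance matrix.
    classifier = SVC(kernel="rbf").fit(X, train_labels)
    return vocabulary, classifier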

  24. Step 1: feature extraction
      • Scale-invariant image regions + SIFT (see lecture 2)
        – Affine-invariant regions give "too much" invariance
        – Rotation invariance is "too much" invariance for many realistic
          collections
      • Dense descriptors
        – Improve results in the context of categories (for most categories)
        – Interest points do not necessarily capture "all" features
      • Color-based descriptors
      • Shape-based descriptors

  25. Dense features
      – Multi-scale dense grid: extraction of small overlapping patches at
        multiple scales
      – Computation of the SIFT descriptor for each grid cell
      – Example settings: horizontal/vertical step size of 3-6 pixels,
        scaling factor of 1.2 per level
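
A minimal sketch of such a grid with OpenCV (assumed): SIFT descriptors are
computed at fixed grid keypoints rather than detected interest points. The
step size and the 1.2 scaling factor follow the slide; the base patch size
and number of levels are assumptions, and "image.jpg" is a placeholder.

import cv2

def dense_keypoints(width, height, step=6, base_size=16, levels=5, factor=1.2):
    # One layer of keypoints per scale level; the patch size (and, to keep
    # the overlap comparable, the step) grows by `factor` per level.
    kps = []
    for lvl in range(levels):
        size = base_size * factor ** lvl
        s = max(1, int(round(step * factor ** lvl)))
        for y in range(0, height, s):
            for x in range(0, width, s):
                kps.append(cv2.KeyPoint(float(x), float(y), size))
    return kps

gray = cv2.imread("image.jpg", cv2.IMREAD_GRAYSCALE)
sift = cv2.SIFT_create()
kps, desc = sift.compute(gray, dense_keypoints(gray.shape[1], gray.shape[0]))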

  26. Bag-of-features for image classification
      Step 1: Extract regions, compute descriptors → Step 2: Find clusters
      and frequencies → Step 3: Compute distance matrix → Classification (SVM)

  27. Step 2: Quantization

  28. Step 2: Quantization – Clustering

  29. Step 2: Quantization
      Clustering → Visual vocabulary

  30. Examples of visual words
      Airplanes, Motorbikes, Faces, Wild Cats, Leaves, People, Bikes

  31. Hard or soft assignment
      • K-means → hard assignment
        – Assign each descriptor to the closest cluster center
        – Count the number of descriptors assigned to each center
      • Gaussian mixture model → soft assignment
        – Estimate the distance to all centers
        – Sum over the number of descriptors
      • Represent the image by a frequency histogram
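
A minimal sketch contrasting the two assignment schemes (scikit-learn
assumed; random vectors stand in for real SIFT descriptors). Both variants
end in the frequency histogram mentioned in the last bullet.

import numpy as np
from sklearn.cluster import KMeans
from sklearn.mixture import GaussianMixture

rng = np.random.default_rng(0)
train_desc = rng.normal(size=(2000, 128))   # descriptors from training images
desc = rng.normal(size=(300, 128))          # descriptors from one image
k = 64

# Hard assignment: each descriptor votes only for its closest center.
kmeans = KMeans(n_clusters=k, n_init=10).fit(train_desc)
hard = np.bincount(kmeans.predict(desc), minlength=k).astype(float)
hard /= hard.sum()

# Soft assignment: each descriptor spreads its vote over all components.
gmm = GaussianMixture(n_components=k, covariance_type="diag").fit(train_desc)
soft = gmm.predict_proba(desc).sum(axis=0)
soft /= soft.sum()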

  32. Image representation
      (figure: frequency histogram over codewords)
      • Each image is represented by a vector
      • Typically 1000-4000 dimensions
      • Fine-grained: represents model instances
      • Coarse-grained: represents object categories
