Indexing with local features, Bag of words models (Thursday, Oct 29)

10/29/2009
Kristen Grauman, UT-Austin

Last time
• Interest point detection
  – Harris corner detector
  – Laplacian of Gaussian, automatic scale selection

Local features: main components
1) Detection: Identify the interest points.
2) Description: Extract a vector feature descriptor surrounding each interest point.
3) Matching: Determine correspondence between descriptors in two views.

Corners as distinctive interest points

$$M = \sum w(x, y) \begin{bmatrix} I_x I_x & I_x I_y \\ I_x I_y & I_y I_y \end{bmatrix}$$

M is a 2 x 2 matrix of image derivatives (averaged in the neighborhood of a point).

Notation: $I_x \Leftrightarrow \frac{\partial I}{\partial x}$, $I_y \Leftrightarrow \frac{\partial I}{\partial y}$, $I_x I_y \Leftrightarrow \frac{\partial I}{\partial x} \frac{\partial I}{\partial y}$

Harris corners example
• Any local max in a 3 x 3 window from the R map
• Only local maxes exceeding the average R (thresholded)

Properties of the Harris corner detector
• Rotation invariant? Yes
• Scale invariant? No
Figure labels: "Corner!" and "All points will be classified as edges" (the same structure seen at two different scales).
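The corner detection recapped above can be sketched in a few lines of NumPy. This is a minimal illustration, not the lecture's own code: the response formula R = det(M) - k * trace(M)^2 is the standard Harris measure, while the value `k=0.05`, the uniform 3 x 3 window standing in for w(x, y), and the helper names (`box_sum`, `harris_response`, `harris_corners`) are assumptions made for this example.

```python
import numpy as np

def box_sum(a, r=1):
    """Sum each value with its neighbours in a (2r+1) x (2r+1) window (edge-padded)."""
    p = np.pad(a, r, mode="edge")
    h, w = a.shape
    out = np.zeros((h, w), dtype=float)
    for dy in range(2 * r + 1):
        for dx in range(2 * r + 1):
            out += p[dy:dy + h, dx:dx + w]
    return out

def harris_response(img, k=0.05, r=1):
    """Harris response map R = det(M) - k * trace(M)^2 (k and r are assumed values)."""
    img = img.astype(float)
    Iy, Ix = np.gradient(img)          # derivatives along rows (y) and columns (x)
    Sxx = box_sum(Ix * Ix, r)          # entries of M, summed over the window
    Sxy = box_sum(Ix * Iy, r)
    Syy = box_sum(Iy * Iy, r)
    det = Sxx * Syy - Sxy ** 2
    trace = Sxx + Syy
    return det - k * trace ** 2

def harris_corners(R):
    """Keep pixels that are the max of their 3 x 3 window and exceed the average R.

    Ties count as local maxima, so flat plateaus above the average would also pass.
    """
    h, w = R.shape
    p = np.pad(R, 1, mode="constant", constant_values=-np.inf)
    neigh_max = np.full((h, w), -np.inf)
    for dy in range(3):
        for dx in range(3):
            neigh_max = np.maximum(neigh_max, p[dy:dy + h, dx:dx + w])
    return np.argwhere((R == neigh_max) & (R > R.mean()))
```

On a synthetic image of a bright square, the response is positive at the four corners, negative along the edges, and zero in flat regions, which matches the corner / edge / flat interpretation of M.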

Automatic scale selection
We define the characteristic scale as the scale that produces the peak of the Laplacian response.
Slide credit: Lana Lazebnik

Example
Original image at ¾ the size

Scale invariant interest points
Interest points are local maxima in both position and scale.
Squared filter response maps of $L_{xx}(\sigma) + L_{yy}(\sigma)$ at scales $\sigma_1, \ldots, \sigma_5$ ⇒ List of (x, y, σ)

Today
• Matching local features
• Indexing features
• Bag of words model
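Returning briefly to the scale selection recap: the characteristic scale can be illustrated by evaluating the squared Laplacian response at one location over a range of σ and taking the peak. The sketch below is only an illustration under stated assumptions: it multiplies the Laplacian of Gaussian by σ² (the usual scale normalization, which the slide text does not spell out), evaluates the response at a single known center rather than over the whole image, and uses hypothetical helper names.

```python
import numpy as np

def normalized_log_kernel(shape, center, sigma):
    """Scale-normalised Laplacian of Gaussian, sigma^2 * (G_xx + G_yy), on the pixel grid."""
    ys, xs = np.indices(shape)
    r2 = (ys - center[0]) ** 2 + (xs - center[1]) ** 2
    g = np.exp(-r2 / (2 * sigma ** 2)) / (2 * np.pi * sigma ** 2)
    return (r2 - 2 * sigma ** 2) / sigma ** 2 * g

def characteristic_scale(img, center, sigmas):
    """Return the sigma whose squared filter response at `center` is largest."""
    responses = [
        np.sum(img * normalized_log_kernel(img.shape, center, s)) ** 2
        for s in sigmas
    ]
    return float(sigmas[int(np.argmax(responses))]), responses

# Usage: a bright disk of radius 8; the peak should fall near 8 / sqrt(2).
ys, xs = np.indices((101, 101))
disk = ((ys - 50) ** 2 + (xs - 50) ** 2 <= 8 ** 2).astype(float)
best_sigma, _ = characteristic_scale(disk, (50, 50), np.arange(2.0, 12.0, 0.5))
```

A full detector would compute these response maps for every pixel and keep the local maxima in both position and scale, giving the list of (x, y, σ).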

Local features: main components
1) Detection: Identify the interest points.
2) Description: Extract a vector feature descriptor surrounding each interest point:
   $\mathbf{x}_1 = [x_1^{(1)}, \ldots, x_d^{(1)}]$, $\mathbf{x}_2 = [x_1^{(2)}, \ldots, x_d^{(2)}]$
3) Matching: Determine correspondence between descriptors in two views.

Raw patches as local descriptors
The simplest way to describe the neighborhood around an interest point is to write down the list of intensities to form a feature vector.
But this is very sensitive to even small shifts and rotations.

SIFT descriptor [Lowe 2004]
• Use histograms to bin pixels within sub-patches according to their orientation (histogram axis: 0 to 2π).
Why subpatches?
Why does SIFT have some illumination invariance?

Making the descriptor rotation invariant
• Rotate patch according to its dominant gradient orientation.
• This puts the patches into a canonical orientation.
Image from Matthew Brown (CSE 576: Computer Vision)
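The orientation binning described above can be sketched as follows. This is a simplified illustration rather than Lowe's full descriptor: it builds a single histogram for one patch (no sub-patch grid, no interpolation, no normalization), weights each pixel by its gradient magnitude (an assumption), and the function names and bin counts are hypothetical.

```python
import numpy as np

def orientation_histogram(patch, n_bins=8):
    """Histogram of gradient orientations over [0, 2*pi), weighted by gradient magnitude."""
    gy, gx = np.gradient(patch.astype(float))
    mag = np.hypot(gx, gy)
    ang = np.arctan2(gy, gx) % (2 * np.pi)
    # Clamp guards against an angle that rounds up to exactly 2*pi.
    bins = np.minimum((ang / (2 * np.pi) * n_bins).astype(int), n_bins - 1)
    return np.bincount(bins.ravel(), weights=mag.ravel(), minlength=n_bins)

def dominant_orientation(patch, n_bins=36):
    """Center angle (radians) of the strongest orientation bin."""
    hist = orientation_histogram(patch, n_bins)
    return (int(np.argmax(hist)) + 0.5) * 2 * np.pi / n_bins
```

Rotating the patch by minus the dominant orientation before building the descriptor is what puts patches into a canonical orientation. Because the histogram depends on gradients rather than raw intensities, adding a constant brightness offset to the patch leaves it unchanged.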

SIFT descriptor [Lowe 2004]
Extraordinarily robust matching technique
• Can handle changes in viewpoint
  – Up to about 60 degree out of plane rotation
• Can handle significant changes in illumination
  – Sometimes even day vs. night
• Fast and efficient; can run in real time
• Lots of code available
  – http://people.csail.mit.edu/albert/ladypack/wiki/index.php/Known_implementations_of_SIFT
Steve Seitz

Matching local features
To generate candidate matches, find patches that have the most similar appearance (e.g., lowest SSD).
Simplest approach: compare them all, take the closest (or closest k, or within a thresholded distance).
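The "compare them all, take the closest" approach can be sketched as a brute-force search over an SSD matrix. This is a minimal sketch: descriptors are assumed to be rows of NumPy arrays, and the function names and the optional `max_ssd` threshold parameter are hypothetical.

```python
import numpy as np

def ssd_matrix(desc1, desc2):
    """SSD between every descriptor in desc1 (n1 x d) and every one in desc2 (n2 x d)."""
    diff = desc1[:, None, :] - desc2[None, :, :]
    return np.sum(diff ** 2, axis=2)

def match_nearest(desc1, desc2, max_ssd=None):
    """For each descriptor in desc1, return (index1, index2, ssd) of its closest match.

    If max_ssd is given, matches with a larger SSD are dropped.
    """
    d = ssd_matrix(desc1, desc2)
    nearest = np.argmin(d, axis=1)
    matches = [(i, int(j), float(d[i, j])) for i, j in enumerate(nearest)]
    if max_ssd is not None:
        matches = [m for m in matches if m[2] <= max_ssd]
    return matches
```

The cost is proportional to n1 * n2 * d, which is fine for a pair of images but motivates indexing once the database grows.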

Matching local features
In the stereo case, we may constrain matches by proximity if we make assumptions on max disparities.

Ambiguous matches
At what SSD value do we have a good match?
To add robustness to matching, we can consider the ratio: distance to best match / distance to second best match.
If the ratio is high (close to 1), the match could be ambiguous.
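The ratio check can be sketched like this. It is an illustration under assumptions: distances are Euclidean, `desc2` must contain at least two descriptors, and the cutoff `max_ratio=0.8` is an illustrative value, not one given in the slides.

```python
import numpy as np

def ratio_test(desc1, desc2, max_ratio=0.8):
    """Keep matches whose best / second-best distance ratio is below max_ratio.

    Returns a list of (index1, index2, ratio). desc2 needs at least two rows.
    """
    diff = desc1[:, None, :] - desc2[None, :, :]
    dist = np.sqrt(np.sum(diff ** 2, axis=2))
    order = np.argsort(dist, axis=1)
    kept = []
    for i in range(dist.shape[0]):
        best, second = order[i, 0], order[i, 1]
        d1, d2 = dist[i, best], dist[i, second]
        # If the second-best distance is zero, the two candidates are indistinguishable.
        ratio = d1 / d2 if d2 > 0 else 1.0
        if ratio < max_ratio:
            kept.append((i, int(best), float(ratio)))
    return kept
```

A descriptor with two equally close candidates gets a ratio of 1 and is rejected, while a descriptor with one clearly closest candidate gets a small ratio and is kept.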

Applications of local invariant features
• Wide baseline stereo
• Motion tracking
• Panoramas
• Mobile robot navigation
• 3D reconstruction
• Recognition
• …

Automatic mosaicing
http://www.cs.ubc.ca/~mbrown/autostitch/autostitch.html

Wide baseline stereo
[Image from T. Tuytelaars ECCV 2006 tutorial]

Recognition
Sivic and Zisserman, 2003; Schmid and Mohr 1997; Lowe 2002; Rothganger et al. 2003

Today
• Matching local features
• Indexing features
• Bag of words model

Indexing local features
• Each patch / region has a descriptor, which is a point in some high-dimensional feature space (e.g., SIFT).

Indexing local features
• When we see close points in feature space, we have similar descriptors, which indicates similar local content.
• This is of interest not only for 3D reconstruction, but also for retrieving images of similar objects.
Figure credit: A. Zisserman

Indexing local features
• With potentially thousands of features per image, and hundreds to millions of images to search, how to efficiently find those that are relevant to a new image?

Indexing local features: inverted file index
• For text documents, an efficient way to find all pages on which a word occurs is to use an index…
• We want to find all images in which a feature occurs.
• To use this idea, we’ll need to map our features to “visual words”.

Text retrieval vs. image search
• What makes the problems similar, different?

Visual words: main idea
• Extract some local features from a number of images …
e.g., SIFT descriptor space: each point is 128-dimensional
Slide credit: D. Nister, CVPR 2006

Visual words: main idea
Each point is a local descriptor, e.g. a SIFT vector.

Visual words
Map high-dimensional descriptors to tokens/words by quantizing the feature space.
• Quantize via clustering; let cluster centers be the prototype “words”.

Visual words
Map high-dimensional descriptors to tokens/words by quantizing the feature space.
• Determine which word to assign to each new image region by finding the closest cluster center.

Visual words
• Example: each group of patches belongs to the same visual word.
Figure from Sivic & Zisserman, ICCV 2003
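The quantization step can be sketched with a small NumPy clustering routine. The slides only say "quantize via clustering"; the choice of k-means here, the farthest-first initialization, the fixed iteration count, and all function names are assumptions made for this illustration.

```python
import numpy as np

def assign_words(descriptors, centers):
    """Index of the closest cluster center (visual word) for each descriptor."""
    d = np.sum((descriptors[:, None, :] - centers[None, :, :]) ** 2, axis=2)
    return np.argmin(d, axis=1)

def init_farthest_first(descriptors, k):
    """Deterministic start: first descriptor, then repeatedly the point farthest from all centers."""
    centers = [descriptors[0]]
    d = np.sum((descriptors - descriptors[0]) ** 2, axis=1)
    for _ in range(1, k):
        idx = int(np.argmax(d))
        centers.append(descriptors[idx])
        d = np.minimum(d, np.sum((descriptors - descriptors[idx]) ** 2, axis=1))
    return np.array(centers, dtype=float)

def build_vocabulary(descriptors, k, n_iter=20):
    """k-means (Lloyd's algorithm): the returned centers are the prototype visual words."""
    centers = init_farthest_first(descriptors, k)
    for _ in range(n_iter):
        words = assign_words(descriptors, centers)
        for j in range(k):
            members = descriptors[words == j]
            if len(members) > 0:          # leave an empty cluster's center where it is
                centers[j] = members.mean(axis=0)
    return centers
```

In use, `build_vocabulary` would run once on descriptors pooled from many training images, and `assign_words` would then map the descriptors of any new image region to word indices. Real SIFT descriptors are 128-dimensional, and practical vocabularies are far larger than what this brute-force version handles comfortably.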

Visual words and textons
• First explored for texture and material representations
• Texton = cluster center of filter responses over collection of images
• Describe textures and materials based on distribution of prototypical texture elements.
Leung & Malik 1999; Varma & Zisserman, 2002; Lazebnik, Schmid & Ponce, 2003

Recall: Texture representation example
Statistics to summarize patterns in small windows:

          mean d/dx value   mean d/dy value
Win. #1          4                10
Win. #2         18                 7
…
Win. #9         20                20

Plot axes: Dimension 1 (mean d/dx value) vs. Dimension 2 (mean d/dy value).
Plot regions: windows with primarily horizontal edges; windows with primarily vertical edges; both; windows with small gradient in both directions.

Visual words
• More recently used for describing scenes and objects for the sake of indexing or classification.
Sivic & Zisserman 2003; Csurka, Bray, Dance, & Fan 2004; many others.

Inverted file index
• Database images are loaded into the index mapping words to image numbers.

Inverted file index
• New query image is mapped to indices of database images that share a word.
• If a local image region is a visual word, how can we summarize an image (the document)?
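The inverted file index can be sketched with plain Python dictionaries. This is a minimal sketch: images are assumed to be already converted to lists of visual word ids, and ranking candidates by the number of distinct shared words is an assumption of this example, not a scoring rule from the slides.

```python
from collections import Counter, defaultdict

def build_inverted_index(image_words):
    """Map each visual word id to the set of database image ids that contain it."""
    index = defaultdict(set)
    for image_id, words in image_words.items():
        for w in words:
            index[w].add(image_id)
    return index

def query(index, query_words):
    """Return (image_id, shared_word_count) pairs, most shared words first."""
    votes = Counter()
    for w in set(query_words):
        for image_id in index.get(w, ()):
            votes[image_id] += 1
    return sorted(votes.items(), key=lambda kv: (-kv[1], kv[0]))

# Usage with made-up word ids: image 3 shares only word 8 with the query.
database = {1: [7, 7, 91], 2: [7, 8], 3: [1, 8, 10, 65, 72]}
index = build_inverted_index(database)
candidates = query(index, [7, 8, 91])
```

Only the index entries for words present in the query are touched, so database images that share no word with the query are never visited.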
