Instance level recognition IV: Very large databases
Cordelia Schmid LEAR – INRIA Grenoble
Visual search example: matching under a change in viewing angle, 22 correct matches.
Image search system for large datasets:
query → image search system → ranked image list
Image dataset: > 1 million images
– 2 × 10^9 descriptors to index for one million images!
– Size of the descriptors: 1 TB → search and memory intractable
Query image → set of SIFT descriptors (Harris-Hessian-Laplace regions + SIFT descriptors)
→ sparse frequency vector (bag-of-features processing + tf-idf weighting; centroids = visual words)
→ querying the inverted file → ranked image short-list
→ geometric verification → re-ranked list
– 1 word (index) per local descriptor
– only image ids in the inverted file ⇒ 8 GB for a million images, fits in RAM
[Chum et al. 2007]
– Matching approximation
– Quantize via k-means clustering to obtain visual words
– Assign descriptors to the closest visual words
Bag-of-features matching function: descriptor matching with k-nearest neighbors is approximated by the matching function
f(x, y) = δ_{q(x), q(y)}
where q(x) is a quantizer, i.e. the assignment of a descriptor to a visual word, and δ_{a,b} is the Kronecker delta (δ_{a,b} = 1 iff a = b).
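To make these two steps concrete, here is a minimal numpy sketch (array sizes, names, and the random data are illustrative, not from the slides): descriptors are assigned to their nearest visual word, and two images are scored with the Kronecker-delta matching function, which reduces to a dot product of their visual-word histograms.

```python
import numpy as np

def assign_to_visual_words(descriptors, centroids):
    """q(x): index of the nearest centroid (visual word) for each descriptor."""
    d2 = (descriptors ** 2).sum(1)[:, None] + (centroids ** 2).sum(1)[None, :] \
         - 2.0 * descriptors @ centroids.T
    return d2.argmin(axis=1)

def bof_matching_score(words_query, words_db, k):
    """Sum of delta_{q(x),q(y)} over all descriptor pairs of the two images,
    i.e. the dot product of their visual-word histograms."""
    hq = np.bincount(words_query, minlength=k)
    hd = np.bincount(words_db, minlength=k)
    return int(np.dot(hq, hd))

# toy example: 128-D SIFT-like descriptors, codebook of k visual words
rng = np.random.default_rng(0)
k = 1000
centroids = rng.normal(size=(k, 128))      # in practice: learned by k-means
wq = assign_to_visual_words(rng.normal(size=(300, 128)), centroids)
wd = assign_to_visual_words(rng.normal(size=(500, 128)), centroids)
print(bof_matching_score(wq, wd, k))
```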
An ANN algorithm returns a short-list of potential neighbors:
– this short-list is supposed to contain the NN with high probability
– exact search may be performed to re-order this short-list
ANN algorithms are evaluated by the trade-off between:
– Accuracy: NN recall = probability that the NN is in the short-list
– Ambiguity removal = proportion of database vectors kept in the short-list
In BOF, this trade-off is managed by the number of clusters k.
[Plot: NN recall vs. rate of points retrieved for BOW, for a range of vocabulary sizes k (100 to 50,000)]
– for a “small” visual dictionary: too many false matches
– for a “large” visual dictionary: high complexity, and true matches are missed
– either the Voronoi cells are too big
– or the cells cannot absorb the descriptor noise
→ the intrinsic approximate nearest neighbor search of BOF is not sufficient
Hamming Embedding: representation of a descriptor x
– vector-quantized to q(x) as in standard BOF
– + a short binary vector b(x) for an additional localization in the Voronoi cell
Two descriptors x and y match iff
q(x) = q(y) and h(b(x), b(y)) ≤ h_t
where h(a, b) is the Hamming distance between the binary signatures and h_t is a fixed threshold.
Tf-idf weighting:
tf_ij = n_ij / Σ_k n_kj
idf_i = log( |D| / |{ d : t_i ∈ d }| )
tf-idf_ij = tf_ij · idf_i
where n_ij is the number of occurrences of visual word t_i in image d_j, |D| is the total number of images, and |{ d : t_i ∈ d }| is the number of images containing the word t_i.
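A small numpy sketch of this weighting (the variable names are mine; `word_counts` would be the images × visual-words occurrence matrix produced by the assignment step above):

```python
import numpy as np

def tfidf_vectors(word_counts):
    """word_counts: (n_images, n_words) matrix of visual-word occurrences n_ij.
    Returns the tf-idf weighted frequency vectors."""
    n_images = word_counts.shape[0]
    # tf: normalise each image histogram by its total number of visual words
    tf = word_counts / np.maximum(word_counts.sum(axis=1, keepdims=True), 1)
    # idf_i = log(|D| / number of images containing word t_i)
    images_with_word = np.maximum((word_counts > 0).sum(axis=0), 1)
    idf = np.log(n_images / images_with_word)
    return tf * idf[None, :]

# toy usage with random counts
rng = np.random.default_rng(0)
counts = rng.integers(0, 3, size=(5, 10))
print(tfidf_vectors(counts).shape)   # (5, 10)
```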
→ a metric in the embedded space reduces dimensionality curse effects
– Hamming distance = very few operations
– fewer random memory accesses: 3× faster than BOF with the same dictionary size!
Off-line (given the quantizer):
– draw an orthogonal projection matrix P of size d_b × d → this defines d_b random projection directions
– for each Voronoi cell and projection direction, compute the median value over a learning set
On-line: compute the binary signature b(x) of a given descriptor x
– project x onto the projection directions: z(x) = (z_1, …, z_db)
– b_i(x) = 1 if z_i(x) is above the learned median value, otherwise 0
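A minimal numpy sketch of these off-line and on-line steps under illustrative assumptions (random data as the learning set, a single Voronoi cell, d_b = 64 bits, an example threshold h_t); the real system learns one set of medians per Voronoi cell.

```python
import numpy as np

d, d_b = 128, 64                        # descriptor dimension, signature length in bits
rng = np.random.default_rng(0)

# --- off-line: random orthogonal projection + per-cell medians --------------
P = np.linalg.qr(rng.normal(size=(d, d)))[0][:, :d_b]   # d x d_b projection directions
learning_set = rng.normal(size=(10000, d))              # descriptors falling in one Voronoi cell
medians = np.median(learning_set @ P, axis=0)           # one median per projection direction

# --- on-line: binary signature of a descriptor x ----------------------------
def signature(x):
    z = x @ P                                  # z(x) = (z_1, ..., z_db)
    return (z > medians).astype(np.uint8)      # b_i(x) = 1 if z_i(x) is above the median

def hamming(a, b):
    return int((a != b).sum())

# two descriptors match iff they share the visual word and h(b(x), b(y)) <= h_t
x, y = rng.normal(size=d), rng.normal(size=d)
h_t = 22                                       # example threshold
print(hamming(signature(x), signature(y)) <= h_t)
```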
Trade-off between memory usage and accuracy:
[Plot: rate of 5-NN retrieved (recall) vs. rate of cell points retrieved, for binary signatures of 8, 16, 32, 64 and 128 bits]
More bits yield higher accuracy. In practice, 64 bits (8 bytes) are used.
[Plot: NN recall vs. rate of points retrieved, comparing HE+BOW (various Hamming thresholds h_t) with plain BOW (various vocabulary sizes k)]
Compared to BOW: at least 10 times fewer points in the short-list for the same level of NN recall.
Hamming Embedding provides a much better trade-off between recall and ambiguity removal.
201 matches 240 matches Many matches with the non-corresponding image!
69 matches 35 matches Still many matches with the non-corresponding one
83 matches 8 matches 10x more matches with the corresponding image!
The same retrieval pipeline [Chum et al. 2007], now focusing on the last step: geometric verification and re-ranking of the short-list.
Geometric verification uses the position and shape of the underlying features to improve retrieval quality. Both images have many matches – which one is correct?
We can measure the spatial consistency between the query and each result to improve retrieval quality (see the sketch below):
– many spatially consistent matches → correct result
– few spatially consistent matches → incorrect result
Gives localization of the object
– works very well
– but performed on a short-list only (typically, 100 images)
→ for very large datasets, the number of distracting images is so high that relevant images are not even short-listed!
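One common way to implement the spatial-consistency check is RANSAC estimation of a geometric transform between the matched keypoints; the slides do not specify the estimator, so the OpenCV homography version below is only an illustrative sketch. The short-list is then re-ranked by the number of inlier matches.

```python
import numpy as np
import cv2

def spatial_consistency(query_pts, result_pts, reproj_thresh=5.0):
    """query_pts, result_pts: (n, 2) arrays of matched keypoint locations.
    Returns the number of spatially consistent (inlier) matches."""
    if len(query_pts) < 4:
        return 0
    H, mask = cv2.findHomography(np.float32(query_pts), np.float32(result_pts),
                                 cv2.RANSAC, reproj_thresh)
    return 0 if mask is None else int(mask.sum())

# re-rank a short-list by decreasing number of consistent matches, e.g.:
# shortlist.sort(key=lambda img: spatial_consistency(q_pts[img], db_pts[img]), reverse=True)
```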
[Plot: rate of relevant images short-listed vs. dataset size (1,000 to 1,000,000 images), for short-list sizes of 20, 100 and 1,000 images]
Weak geometric consistency (WGC): each feature has weak geometric information associated with it, here the characteristic scale and the dominant gradient orientation.
Example image pair: scale change of 2, rotation angle of about 20 degrees.
The maximum of the histogram of orientation differences corresponds to the rotation angle between the images.
– votes for an image are accumulated in two quantized subspaces, i.e. for angle and for scale
– these subspaces are shown to be roughly independent
– final score: filtering of the votes for each parameter (angle and scale)
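A rough numpy sketch of this voting scheme (the bin counts, names, and use of raw vote counts are my assumptions; the score is taken here as the minimum of the two histogram peaks, one plausible reading of the per-parameter filtering):

```python
import numpy as np

N_ANGLE_BINS, N_SCALE_BINS = 32, 16          # illustrative quantization

def wgc_score(query_feats, db_feats, matches):
    """query_feats / db_feats: (n, 2) arrays of (orientation in rad, log scale).
    matches: list of (query_idx, db_idx) tentative descriptor matches."""
    angle_hist = np.zeros(N_ANGLE_BINS)
    scale_hist = np.zeros(N_SCALE_BINS)
    for qi, di in matches:
        d_angle = (db_feats[di, 0] - query_feats[qi, 0]) % (2 * np.pi)
        d_scale = db_feats[di, 1] - query_feats[qi, 1]       # log of the scale ratio
        angle_hist[int(d_angle / (2 * np.pi) * N_ANGLE_BINS) % N_ANGLE_BINS] += 1
        scale_hist[int(np.clip(d_scale + N_SCALE_BINS / 2, 0, N_SCALE_BINS - 1))] += 1
    # keep only votes consistent in both parameters: peak of each histogram, then min
    return min(angle_hist.max(), scale_hist.max())
```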
Re-ranking based on the full geometric transformation still adds information in a final stage.
Evaluation dataset: INRIA Holidays
– 500 query images + 991 annotated true positives
– most images are holiday photos of friends and family
Evaluation measure: mean average precision (mAP, bigger = better)
– average over the precision/recall curve
[Example queries from the Holidays dataset with their corresponding database images]
Experimental evaluation
Evaluation on the Holidays dataset merged with up to 1 million distractor images from Flickr. Methods compared: baseline, WGC, HE, WGC+HE, and +re-ranking.
Average query time (4 CPU cores):
– compute descriptors: 880 ms
– quantization: 600 ms
– search, baseline: 620 ms
– search, WGC: 2110 ms
– search, HE: 200 ms
– search, HE+WGC: 650 ms
[Plot: mAP vs. database size (1,000 to 1,000,000 images) for the compared methods]
Comparison with the state of the art: Oxford dataset [Philbin et al. CVPR’07]
Evaluation measure: Mean average precision (mAP)
Comparison with the state of the art: Kentucky dataset [Nister et al. CVPR’06]
4 images per object. Evaluation measure: among the 4 best retrieval results, how many are correct (score ranges from 1 to 4).
[14] Philbin et al., CVPR’08; [6] Nister et al., CVPR’06; [10] Harzallah et al., CVPR’07
Demo at http://bigimbaz.inrialpes.fr
Towards larger databases?
► BOF + HE can handle up to about 10 million images
– with a limited number of descriptors per image
– 40 GB of RAM
– search = 2 s
► With 100 M images per machine → search = 20 s, RAM = 400 GB → not tractable!
Recent approaches for very large scale indexing
Query image → set of SIFT descriptors (Hessian-Affine regions + SIFT descriptors)
→ frequency vector (bag-of-features processing + tf-idf weighting; centroids = visual words; not necessarily a BOF)
→ vector compression (to reduce storage requirements)
→ vector search → ranked image short-list
→ geometric verification → re-ranked list
Related work on very large scale image search
– Existing compact-code approaches (e.g. miniBOF, compared in the results below) require hundreds of bytes to obtain a “reasonable quality”
– Global scene context: the GIST descriptor [Torralba et al. 2003]
– GIST descriptor + spectral hashing: very limited invariance to scale/rotation/crop
Compact image representation
► Aim: improving the trade-off between
– search speed
– memory usage
– search quality
► Approach: joint optimization of three stages
– local descriptor aggregation → VLAD image representation
– dimension reduction → PCA + PQ codes
– indexing algorithm → (non-)exhaustive search
[H. Jegou et al., Aggregating local descriptors into a compact image representation, CVPR’10]
Aggregation of local descriptors
set of n local descriptors → 1 vector
BOF representation:
► sparse vector
► highly dimensional → the dimensionality reduction/compression needed introduces loss
Fisher-kernel-style aggregation:
► non-sparse vector
► excellent results with a small vector dimensionality
VLAD: vector of locally aggregated descriptors
► learn a vector quantizer with k-means → k centroids (visual words) c_1, …, c_k
► each centroid c_i has dimension d
► for a given image, assign each descriptor x to its closest center c_i
► accumulate (sum) the descriptor residuals per cell: v_i := v_i + (x − c_i)
► the VLAD is the concatenation of the v_i → dimension D = k × d
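A compact numpy sketch of this construction (the codebook size, descriptor counts, and random data are illustrative; in practice the centroids come from k-means on a training set, and the CVPR’10 paper L2-normalizes the resulting vector):

```python
import numpy as np

def vlad(descriptors, centroids):
    """descriptors: (n, d) local descriptors of one image.
    centroids: (k, d) visual words. Returns the D = k*d VLAD vector."""
    k, d = centroids.shape
    # assign each descriptor to its closest centroid
    d2 = (descriptors ** 2).sum(1)[:, None] + (centroids ** 2).sum(1)[None, :] \
         - 2.0 * descriptors @ centroids.T
    assign = d2.argmin(axis=1)
    v = np.zeros((k, d))
    for i in range(k):
        v[i] = (descriptors[assign == i] - centroids[i]).sum(axis=0)  # sum of x - c_i
    v = v.reshape(-1)
    return v / max(np.linalg.norm(v), 1e-12)   # L2 normalisation (as in the paper)

rng = np.random.default_rng(0)
centroids = rng.normal(size=(16, 128))             # k = 16 visual words
image_descriptors = rng.normal(size=(1000, 128))   # n SIFT-like descriptors
print(vlad(image_descriptors, centroids).shape)    # (2048,) = k * d
```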
VLADs for corresponding images:
[Figure: the components v_1, v_2, v_3, … of each VLAD shown as a SIFT-like representation per centroid (+ components: blue, − components: red)]
VLAD performance and dimensionality reduction
Performance (mAP on the Holidays dataset) as a function of the aggregator and of the reduced dimension D':

Aggregator   k        D        D'=D (no reduction)   D'=128   D'=64
BoF          1,000    1,000    41.4                  44.4     43.4
BoF          20,000   20,000   44.6                  45.2     44.5
BoF          200,000  200,000  54.9                  43.2     41.6
VLAD         16       2,048    49.6                  49.5     49.4
VLAD         64       8,192    52.6                  51.0     47.7
VLAD         256      32,768   57.5                  50.8     47.6

► VLAD better than BoF for a given descriptor size → comparable to Fisher kernels for these operating points
► Choose a small D if the output dimension D' is small
Product quantization for nearest neighbor search
► the vector y is split into 8 subvectors of 16 components each: y = [y_1 | y_2 | … | y_8]
► each subvector is quantized separately: q(y) = [q_1(y_1), q_2(y_2), …, q_8(y_8)], where each sub-quantizer q_i is learned by k-means with a limited number of centroids
► each subvector is quantized with 256 centroids → 8 bits
⇒ 8 subvectors × 8 bits = 64-bit quantization index
► very large effective codebook: 256^8 ≈ 1.8 × 10^19
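A small sketch of the training and encoding steps using scikit-learn’s k-means, with the parameters from the example above (8 subvectors of 16 components, 256 centroids each); the data and names are illustrative.

```python
import numpy as np
from sklearn.cluster import KMeans

D, M, KS = 128, 8, 256          # vector dimension, number of subvectors, centroids per sub-quantizer
DS = D // M                     # 16 components per subvector

def train_product_quantizer(train_vectors):
    """Learn one k-means sub-quantizer per subvector."""
    return [KMeans(n_clusters=KS, n_init=4, random_state=0)
            .fit(train_vectors[:, m * DS:(m + 1) * DS]) for m in range(M)]

def pq_encode(quantizers, vectors):
    """Encode each vector as 8 indices of 8 bits each -> a 64-bit code."""
    codes = np.empty((len(vectors), M), dtype=np.uint8)
    for m, q in enumerate(quantizers):
        codes[:, m] = q.predict(vectors[:, m * DS:(m + 1) * DS])
    return codes

rng = np.random.default_rng(0)
train = rng.normal(size=(5000, D)).astype(np.float32)
pq = train_product_quantizer(train)
print(pq_encode(pq, train[:10]).shape)   # (10, 8): 8 bytes per vector
```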
Joint optimization of VLAD and dimension reduction-indexing
► The larger k, the better the raw search performance
► But a large k produces large vectors, which are harder to index
► Fixed output size (in bytes): D' is computed from k via the joint optimization of reduction/indexing
► Only k has to be set → end-to-end parameter optimization
Results on the Holidays dataset with various quantization parameters
Results on standard datasets
► University of Kentucky benchmark: score = number of relevant images among the top 4 (max: 4)
► INRIA Holidays dataset: score = mAP (%)

Method                  bytes   UKB    Holidays
BoF, k=20,000           10K     2.92   44.6
BoF, k=200,000          12K     3.06   54.9
miniBOF                 20      2.07   25.5
miniBOF                 160     2.72   40.3
VLAD k=16, ADC 16 x 8   16      2.88   46.0
VLAD k=64, ADC 32 x 10  40      3.10   49.5
miniBOF: “Packing Bag-of-Features”, ICCV’09. D' = 64 for k = 16 and D' = 96 for k = 64. ADC notation: (number of subvectors) × (bits to encode each subvector).
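To illustrate how ADC (asymmetric distance computation) uses such codes at search time, here is a rough numpy sketch building on the product quantizer above (the names and the `top` parameter are mine): the query stays uncompressed, per-subvector distance tables to the 256 sub-centroids are precomputed, and the distance to every database code is a sum of table look-ups.

```python
import numpy as np

def adc_search(query, quantizers, db_codes, top=10):
    """query: uncompressed (D,) vector; quantizers: the M fitted sub-quantizers;
    db_codes: (n, M) uint8 PQ codes. Returns indices of the `top` nearest codes."""
    M = len(quantizers)
    DS = len(query) // M
    # distance tables: squared distance from each query subvector to the 256 sub-centroids
    tables = np.stack([
        ((quantizers[m].cluster_centers_ - query[m * DS:(m + 1) * DS]) ** 2).sum(axis=1)
        for m in range(M)])                                  # shape (M, 256)
    # asymmetric distance to every database code = sum of M table look-ups
    dists = tables[np.arange(M)[None, :], db_codes].sum(axis=1)
    return np.argsort(dists)[:top]

# usage with the product quantizer of the previous sketch:
# neighbors = adc_search(train[0], pq, pq_encode(pq, train))
```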
Large scale experiments (10 million images)
Search timings per query:
► exhaustive search on the uncompressed vectors: 4.77 s
► exhaustive search with ADC: 0.29 s
► non-exhaustive search with IVFADC (combination of ADC with an inverted file): 0.014 s
Large scale experiments (10 million images)
[Plot: recall@1 vs. database size (1,000 to 10 million images); database: Holidays + images from Flickr; methods: BOF D=200k, VLAD k=64, VLAD k=64 D'=96, VLAD k=64 ADC 16 bytes, VLAD + Spectral Hashing 16 bytes]
Timings: 4.768 s, ADC: 0.286 s, IVFADC: 0.014 s, SH ≈ 0.267 s