Introduction to Visual Search and Recognition


SLIDE 1

Perceptual and Sensory Augmented Computing

Visual Search Tutorial

Introduction to Visual Search and Recognition


Global representations: limitations

  • Success may rely on alignment → sensitive to viewpoint
  • All parts of the image or window impact the description → sensitive to occlusion, clutter


SLIDE 2


Local representations

  • Describe component regions or patches separately.
  • Many options for detection & description…


Superpixels [Ren et al.]; Shape context [Belongie 02]; Maximally Stable Extremal Regions [Matas 02]; Geometric Blur [Berg 05]; SIFT [Lowe 99]; Salient regions [Kadir 01]; Harris-Affine [Mikolajczyk 04]; Spin images [Johnson 99]

Recall: Invariant local features

Subset of local feature types designed to be invariant to

  • Scale
  • Translation
  • Rotation
  • Affine transformations
  • Illumination

1) Detect interest points

2) Extract descriptors

Figure: descriptors as d-dimensional vectors, e.g. (x1, x2, …, xd), (y1, y2, …, yd)

[Mikolajczyk 01, Matas 02, Tuytelaars 04, Lowe 99, Kadir 01, …]

SLIDE 3


Recognition with local feature sets

  • Previously, we saw how to use local invariant features + a global spatial model to recognize specific objects, using a planar object assumption.
  • Now, we’ll use local features for:
  • Indexing-based recognition
  • Bags of words representations
  • Correspondence / matching kernels



Basic flow


1) Detect or sample features → list of positions, scales, orientations

2) Describe features → associated list of d-dimensional descriptors

3) Index each one into a pool of descriptors from previously seen images

SLIDE 4


Indexing local features

  • Each patch / region has a descriptor, which is a point in some high-dimensional feature space (e.g., SIFT)


Indexing local features

  • When we see close points in feature space, we have similar descriptors, which indicates similar local content.

Figure credit: A. Zisserman

SLIDE 5


Indexing local features

  • We saw in the previous section how to use voting and pose clustering to identify objects using local features.

Figure credit: David Lowe


Indexing local features

  • With potentially thousands of features per image, and hundreds to millions of images to search, how do we efficiently find the ones that are relevant to a new image?
  • Low-dimensional descriptors: can use standard efficient data structures for nearest neighbor search
  • High-dimensional descriptors: approximate nearest neighbor search methods are more practical
  • Inverted file indexing schemes


SLIDE 6


Indexing local features: approximate nearest neighbor search


  • Best-Bin First (BBF): a variant of k-d trees that uses a priority queue to examine the most promising branches first [Beis & Lowe, CVPR 1997]
  • Locality-Sensitive Hashing (LSH): a randomized hashing technique using hash functions that map similar points to the same bin with high probability [Indyk & Motwani, 1998]
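The hashing idea can be sketched with a minimal random-hyperplane LSH in Python (function names such as `make_hyperplanes` and `build_table` are illustrative, not from the cited papers; a real system would use many tables and wider keys):

```python
import random
from collections import defaultdict

def make_hyperplanes(num_planes, dim, seed=0):
    # Random Gaussian hyperplanes through the origin; each contributes one hash bit.
    rng = random.Random(seed)
    return [[rng.gauss(0.0, 1.0) for _ in range(dim)] for _ in range(num_planes)]

def lsh_key(vec, planes):
    # Hash = which side of each hyperplane the point falls on.
    # Nearby points (small angle between them) agree on most bits.
    return tuple(1 if sum(p * v for p, v in zip(plane, vec)) >= 0 else 0
                 for plane in planes)

def build_table(descriptors, planes):
    # Bucket descriptor indices by hash key; a query searches only its own bucket.
    table = defaultdict(list)
    for idx, d in enumerate(descriptors):
        table[lsh_key(d, planes)].append(idx)
    return table
```

A query descriptor is hashed with `lsh_key` and compared only against the descriptors in its bucket, avoiding a linear scan of the whole pool.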


Indexing local features: inverted file index

  • For text documents, an efficient way to find all pages on which a word occurs is to use an index…
  • We want to find all images in which a feature occurs.
  • To use this idea, we’ll need to map our features to “visual words”.

K. Grauman, B. Leibe

SLIDE 7


Visual words: main idea

  • Extract some local features from a number of images …

e.g., SIFT descriptor space: each point is 128-dimensional

Slide credit: D. Nister


SLIDE 10


Visual words: main idea

Map high-dimensional descriptors to tokens/words by quantizing the feature space

  • Quantize via clustering; let cluster centers be the prototype “words”

Figure: descriptor space


Visual words: main idea

Map high-dimensional descriptors to tokens/words by quantizing the feature space

  • Determine which word to assign to each new image region by finding the closest cluster center.

Figure: descriptor space
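Both steps — clustering the descriptor space and assigning new regions to the nearest center — can be sketched with plain Lloyd's k-means (names like `build_vocabulary` are illustrative; real systems cluster millions of 128-d SIFT descriptors, not toy 2-d points):

```python
import random

def dist2(a, b):
    # Squared Euclidean distance between two descriptors.
    return sum((x - y) ** 2 for x, y in zip(a, b))

def assign_word(centers, desc):
    # The visual word of a descriptor is the index of its nearest cluster center.
    return min(range(len(centers)), key=lambda i: dist2(centers[i], desc))

def build_vocabulary(descriptors, k, iters=10, seed=0):
    # Plain Lloyd's k-means; the resulting centers are the prototype "words".
    rng = random.Random(seed)
    centers = rng.sample(descriptors, k)
    for _ in range(iters):
        groups = [[] for _ in range(k)]
        for d in descriptors:
            groups[assign_word(centers, d)].append(d)
        centers = [[sum(c) / len(g) for c in zip(*g)] if g else centers[i]
                   for i, g in enumerate(groups)]
    return centers
```

After `build_vocabulary` runs once offline, `assign_word` maps any new image region's descriptor to a word id.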

SLIDE 11


Visual words

  • Example: each group of patches belongs to the same visual word

Figure from Sivic & Zisserman, ICCV 2003


Visual words

  • First explored for texture and material representations
  • Texton = cluster center of filter responses over a collection of images
  • Describe textures and materials based on the distribution of prototypical texture elements.

Leung & Malik 1999; Varma & Zisserman 2002; Lazebnik, Schmid & Ponce 2003

SLIDE 12


Visual words

  • More recently used for describing scenes and objects for the sake of indexing or classification.

Sivic & Zisserman 2003; Csurka, Bray, Dance, & Fan 2004; many others.


Inverted file index for images comprised of visual words

Image credit: A. Zisserman

Word number → List of image numbers
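The word-number → image-list structure above can be sketched directly with dictionaries (function names are illustrative; production systems also store term frequencies and apply tf-idf weighting):

```python
from collections import defaultdict

def build_inverted_index(image_words):
    # image_words: {image_id: list of visual word ids occurring in that image}
    index = defaultdict(set)
    for image_id, words in image_words.items():
        for w in words:
            index[w].add(image_id)
    return index

def query(index, query_words):
    # Score each image by how many distinct query words it contains,
    # touching only the index entries for the query's words.
    scores = defaultdict(int)
    for w in set(query_words):
        for image_id in index.get(w, ()):
            scores[image_id] += 1
    return sorted(scores, key=scores.get, reverse=True)
```

The payoff is that `query` never looks at images sharing no words with the query, which is what makes search over millions of images feasible.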

SLIDE 13


Bags of visual words

  • Summarize the entire image based on its distribution (histogram) of word occurrences.
  • Analogous to the bag of words representation commonly used for documents.

Image credit: Fei-Fei Li
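Building that histogram is a one-pass count over the image's word ids; a minimal sketch (normalization by the total count is one common choice, so images with different numbers of features stay comparable):

```python
def bow_histogram(word_ids, vocab_size):
    # Count occurrences of each visual word, then normalize to sum to 1.
    hist = [0.0] * vocab_size
    for w in word_ids:
        hist[w] += 1.0
    total = sum(hist)
    return [h / total for h in hist] if total else hist
```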


Video Google System

1. Collect all words within the query region
2. Use the inverted file index to find relevant frames
3. Compare word counts
4. Spatial verification

Sivic & Zisserman, ICCV 2003

  • Demo online at: http://www.robots.ox.ac.uk/~vgg/research/vgoogle/index.html

Figure: query region and retrieved frames

SLIDE 14


Basic flow

1) Detect or sample features → list of positions, scales, orientations

2) Describe features → associated list of d-dimensional descriptors

3) Index each one into a pool of descriptors from previously seen images, or quantize to form a bag of words vector for the image


Visual vocabulary formation

Issues:

  • Sampling strategy
  • Clustering / quantization algorithm
  • Unsupervised vs. supervised
  • What corpus provides features (universal vocabulary?)
  • Vocabulary size, number of words


SLIDE 15


Sampling strategies

Image credits: F-F. Li, E. Nowak, J. Sivic

Figure: dense uniform sampling; sparse sampling at interest points; random sampling; multiple interest operators

  • To find specific, textured objects, sparse sampling from interest points is often more reliable.
  • Multiple complementary interest operators offer more image coverage.
  • For object categorization, dense sampling offers better coverage.

[See Nowak, Jurie & Triggs, ECCV 2006]

Clustering / quantization methods

  • k-means (typical choice), agglomerative clustering, mean-shift, …
  • Hierarchical clustering: allows faster insertion / word assignment while still allowing large vocabularies
  • Vocabulary tree [Nister & Stewenius, CVPR 2006]


SLIDE 16


Example: Recognition with Vocabulary Tree

  • Tree construction: hierarchical k-means clustering of the descriptors

Slide credit: David Nister

[Nister & Stewenius, CVPR’06]

Vocabulary Tree

  • Training: Filling the tree

Slide credit: David Nister

[Nister & Stewenius, CVPR’06]
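The construction and lookup steps can be sketched as recursive k-means plus greedy descent (a toy sketch with illustrative names, tiny branch factor and depth; the actual system uses e.g. branch factor 10 and millions of leaves):

```python
import random

def dist2(a, b):
    return sum((x - y) ** 2 for x, y in zip(a, b))

def kmeans(points, k, iters=8, seed=0):
    # Plain Lloyd's k-means, reused at every tree node.
    rng = random.Random(seed)
    centers = rng.sample(points, k)
    for _ in range(iters):
        groups = [[] for _ in range(k)]
        for p in points:
            groups[min(range(k), key=lambda i: dist2(centers[i], p))].append(p)
        centers = [[sum(c) / len(g) for c in zip(*g)] if g else centers[i]
                   for i, g in enumerate(groups)]
    return centers

def build_tree(points, branch=2, depth=2):
    # Node = (centers, children); recursively cluster each child's points.
    if depth == 0 or len(points) < branch:
        return None
    centers = kmeans(points, branch)
    groups = [[] for _ in range(branch)]
    for p in points:
        groups[min(range(branch), key=lambda i: dist2(centers[i], p))].append(p)
    return (centers, [build_tree(g, branch, depth - 1) for g in groups])

def quantize(tree, desc, path=()):
    # Greedy descent: compare against only `branch` centers per level, so a
    # lookup costs branch * depth comparisons instead of the vocabulary size.
    if tree is None:
        return path
    centers, children = tree
    i = min(range(len(centers)), key=lambda j: dist2(centers[j], desc))
    return quantize(children[i], desc, path + (i,))
```

The path of branch indices returned by `quantize` serves as the visual word id, which is what makes very large vocabularies affordable at query time.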


SLIDE 19


Vocabulary Tree

  • Recognition
  • RANSAC verification

Slide credit: David Nister

[Nister & Stewenius, CVPR’06]


Vocabulary Tree: Performance

  • Evaluated on large databases: indexing with up to 1M images
  • Online recognition for a database of 50,000 CD covers; retrieval in ~1s
  • Found experimentally that large vocabularies can be beneficial for recognition

[Nister & Stewenius, CVPR’06]

SLIDE 20


Vocabulary formation

  • Ensembles of trees provide additional robustness

Figure credit: F. Jurie

Moosmann, Jurie, & Triggs 2006; Yeh, Lee, & Darrell 2007; Bosch, Zisserman, & Munoz 2007; …


Supervised vocabulary formation

  • Recent work considers how to leverage labeled images when constructing the vocabulary

Perronnin, Dance, Csurka, & Bressan, Adapted Vocabularies for Generic Visual Categorization, ECCV 2006.

SLIDE 21


Learning and recognition with bag of words histograms

  • The bag of words representation makes it possible to describe the unordered point set with a single vector (of fixed dimension across image examples)
  • Provides an easy way to use the distribution of feature types with various learning algorithms requiring vector input.
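As one concrete example of feeding such fixed-length vectors to a learner, here is a nearest-neighbor classifier over bag-of-words histograms using histogram intersection as the similarity (a toy sketch with illustrative names and labels; kernel classifiers such as SVMs with an intersection kernel are another common choice):

```python
def intersection(h1, h2):
    # Histogram intersection: overlap between two normalized BoW histograms.
    return sum(min(a, b) for a, b in zip(h1, h2))

def nn_classify(training_set, query_hist):
    # training_set: list of (bow_histogram, label) pairs.
    # Predict the label of the training example whose histogram
    # overlaps the query histogram the most.
    best_hist, best_label = max(training_set,
                                key=lambda pair: intersection(pair[0], query_hist))
    return best_label
```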


Bags of words: pros and cons

+ flexible to geometry / deformations / viewpoint
+ compact summary of image content
+ provides vector representation for sets
+ has yielded good recognition results in practice

– basic model ignores geometry; must verify afterwards, or encode via features
– background and foreground are mixed when the bag covers the whole image
– interest points or sampling: no guarantee to capture object-level parts