Lecture: Visual Bag of Words
Lecture: Visual Bag of Words
Juan Carlos Niebles and Ranjay Krishna, Stanford Vision and Learning Lab
Stanford University, 07-Nov-2019


SLIDE 1

Lecture: Visual Bag of Words

Juan Carlos Niebles and Ranjay Krishna
Stanford Vision and Learning Lab

SLIDE 2

CS 131 Roadmap

  • Pixels: convolutions, edges, descriptors
  • Segments: resizing, segmentation, clustering
  • Images: recognition, detection, machine learning
  • Videos: motion, tracking
  • Web: neural networks, convolutional neural networks

SLIDE 3

What we will learn today

  • Visual bag of words (BoW)
  • Spatial Pyramid Matching
  • Naïve Bayes


SLIDE 4

What we will learn today

  • Visual bag of words (BoW)
  • Spatial Pyramid Matching
  • Naïve Bayes


SLIDE 5

Object Bag of ‘words’

SLIDE 6

Origin 1: Texture Recognition

Example textures (from Wikipedia)

SLIDE 7

Origin 1: Texture Recognition

  • Texture is characterized by the repetition of basic elements, or textons

Julesz, 1981; Cula & Dana, 2001; Leung & Malik 2001; Mori, Belongie & Malik, 2001; Schmid 2001; Varma & Zisserman, 2002, 2003; Lazebnik, Schmid & Ponce, 2003

SLIDE 8

Origin 1: Texture recognition

Figure: textures represented as histograms over a universal texton dictionary

SLIDE 9

Origin 2: Bag-of-words models

  • Orderless document representation: frequencies of words from a dictionary (Salton & McGill, 1983)

SLIDE 10

Origin 2: Bag-of-words models

US Presidential Speeches Tag Cloud http://chir.ag/phernalia/preztags/

  • Orderless document representation: frequencies of words from a dictionary (Salton & McGill, 1983)

SLIDE 11


SLIDE 12


SLIDE 13

Bags of features for object recognition

  • Works pretty well for image-level classification and for recognizing object instances

Csurka et al. (2004), Willamowski et al. (2005), Grauman & Darrell (2005), Sivic et al. (2003, 2005)

face, flowers, building

SLIDE 14

Bags of features for object recognition

Caltech6 dataset

Figure: bag of features vs. parts-and-shape model

SLIDE 15

Bag of features

  • First, take a bunch of images, extract features, and build up a “dictionary” or “visual vocabulary” – a list of common features
  • Given a new image, extract features and build a histogram – for each feature, find the closest visual word in the dictionary

SLIDE 16

Bag of features: outline

1. Extract features

SLIDE 17

Bag of features: outline

1. Extract features
2. Learn “visual vocabulary”

SLIDE 18

Bag of features: outline

1. Extract features
2. Learn “visual vocabulary”
3. Quantize features using visual vocabulary

SLIDE 19

Bag of features: outline

1. Extract features
2. Learn “visual vocabulary”
3. Quantize features using visual vocabulary
4. Represent images by frequencies of “visual words”

SLIDE 20

1. Feature extraction

  • Regular grid
    – Vogel & Schiele, 2003
    – Fei-Fei & Perona, 2005

SLIDE 21

1. Feature extraction

  • Regular grid
    – Vogel & Schiele, 2003
    – Fei-Fei & Perona, 2005
  • Interest point detector
    – Csurka et al. 2004
    – Fei-Fei & Perona, 2005
    – Sivic et al. 2005

SLIDE 22

1. Feature extraction

  • Regular grid
    – Vogel & Schiele, 2003
    – Fei-Fei & Perona, 2005
  • Interest point detector
    – Csurka et al. 2004
    – Fei-Fei & Perona, 2005
    – Sivic et al. 2005
  • Other methods
    – Random sampling (Vidal-Naquet & Ullman, 2002)
    – Segmentation-based patches (Barnard et al. 2003)

SLIDE 23

2. Learning the visual vocabulary

SLIDE 24

2. Learning the visual vocabulary

Clustering

Slide credit: Josef Sivic

SLIDE 25

2. Learning the visual vocabulary

Clustering

Slide credit: Josef Sivic

Visual vocabulary

SLIDE 26

K-means clustering recap

  • Want to minimize the sum of squared Euclidean distances between points $x_i$ and their nearest cluster centers $m_k$:

$$D(X, M) = \sum_{k=1}^{K} \sum_{i \in \text{cluster } k} (x_i - m_k)^2$$

  • Algorithm:
    – Randomly initialize K cluster centers
    – Iterate until convergence:
      – Assign each data point to the nearest center
      – Recompute each cluster center as the mean of all points assigned to it
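The recap above can be sketched in a few lines of NumPy. This is a minimal illustration with made-up 2-D blob data, not the lecture's code:

```python
import numpy as np

def kmeans(points, k, iters=20, seed=0):
    """Minimal k-means following the recap: init, assign, recompute."""
    rng = np.random.default_rng(seed)
    # Randomly initialize K cluster centers from the data points
    centers = points[rng.choice(len(points), size=k, replace=False)]
    for _ in range(iters):
        # Assign each data point to the nearest center
        dists = np.linalg.norm(points[:, None, :] - centers[None, :, :], axis=2)
        labels = dists.argmin(axis=1)
        # Recompute each cluster center as the mean of its assigned points
        for j in range(k):
            if np.any(labels == j):  # guard against empty clusters
                centers[j] = points[labels == j].mean(axis=0)
    return centers, labels

# Two well-separated 2-D blobs; k-means should put one center in each
rng = np.random.default_rng(1)
pts = np.vstack([rng.normal(0, 0.1, (50, 2)), rng.normal(5, 0.1, (50, 2))])
centers, labels = kmeans(pts, k=2)
```

In a real BoW pipeline the points would be patch descriptors (e.g. SIFT vectors) rather than 2-D coordinates, but the iteration is identical.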

SLIDE 27

From clustering to vector quantization

  • Clustering is a common method for learning a visual vocabulary or codebook
    – Unsupervised learning process
    – Each cluster center produced by k-means becomes a codevector
    – Codebook can be learned on a separate training set
    – Provided the training set is sufficiently representative, the codebook will be “universal”
  • The codebook is used for quantizing features
    – A vector quantizer takes a feature vector and maps it to the index of the nearest codevector in the codebook
    – Codebook = visual vocabulary
    – Codevector = visual word
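A vector quantizer as just described (feature vector in, index of the nearest codevector out) is essentially a one-liner; the codebook and features below are invented toy values:

```python
import numpy as np

def quantize(features, codebook):
    """Map each feature vector to the index of the nearest codevector."""
    # Pairwise distances, shape (num_features, num_codevectors)
    d = np.linalg.norm(features[:, None, :] - codebook[None, :, :], axis=2)
    return d.argmin(axis=1)

codebook = np.array([[0.0, 0.0], [10.0, 10.0], [0.0, 10.0]])  # 3 "visual words"
feats = np.array([[0.5, -0.2], [9.0, 11.0], [1.0, 9.0]])
words = quantize(feats, codebook)  # -> array([0, 1, 2])
```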

SLIDE 28

Example visual vocabulary

Fei-Fei et al. 2005

SLIDE 29

Image patch examples of visual words

Sivic et al. 2005

SLIDE 30

Visual vocabularies: Issues

  • How to choose vocabulary size?
    – Too small: visual words not representative of all patches
    – Too large: quantization artifacts, overfitting
  • Computational efficiency
    – Vocabulary trees (Nister & Stewenius, 2006)

SLIDE 31

3. Image representation

Figure: each image is represented as a histogram of codeword frequencies
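Step 3 then turns each image's quantized features into a codeword-frequency histogram. A small sketch, with hypothetical word indices standing in for real quantized features:

```python
import numpy as np

def bow_histogram(word_indices, vocab_size):
    """Count how often each visual word occurs, normalized to sum to 1."""
    counts = np.bincount(word_indices, minlength=vocab_size).astype(float)
    return counts / counts.sum()

# Hypothetical image whose features were quantized to these word indices
words = np.array([0, 2, 2, 1, 2, 0])
hist = bow_histogram(words, vocab_size=4)  # -> [2/6, 1/6, 3/6, 0]
```

Normalizing makes histograms comparable across images with different numbers of detected features.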

SLIDE 32

Image classification

  • Given the bag-of-features representations of images from different classes, how do we learn a model for distinguishing them?

SLIDE 33

Uses of BoW representation

  • Treat as a feature vector for a standard classifier
    – e.g., k-nearest neighbors, support vector machine
  • Cluster BoW vectors over an image collection
    – Discover visual themes
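As a sketch of the first use (feeding BoW vectors to a standard classifier), here is a 1-nearest-neighbor classifier on toy histograms; the class names and numbers are invented for illustration:

```python
import numpy as np

def nearest_neighbor_label(query_hist, train_hists, train_labels):
    """Classify a BoW histogram by its nearest training histogram (1-NN)."""
    d = np.linalg.norm(train_hists - query_hist, axis=1)
    return train_labels[d.argmin()]

# Toy BoW histograms: class "face" heavy on word 0, "car" heavy on word 2
train = np.array([[0.8, 0.1, 0.1],
                  [0.7, 0.2, 0.1],
                  [0.1, 0.1, 0.8],
                  [0.2, 0.1, 0.7]])
labels = np.array(["face", "face", "car", "car"])
print(nearest_neighbor_label(np.array([0.75, 0.15, 0.1]), train, labels))
```

An SVM on the same vectors, as in Csurka et al. (2004), typically works better; 1-NN just keeps the sketch dependency-free.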

SLIDE 34

Large-scale image matching

  • Bag-of-words models have been useful in matching an image to a large database of object instances

11,400 images of game covers (Caltech games dataset)

How do I find this image in the database?

SLIDE 35

Large-scale image search

Build the database:

  – Extract features from the database images
  – Learn a vocabulary using k-means (typical k: 100,000)
  – Compute weights for each word
  – Create an inverted file mapping words → images
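The inverted file in the last step can be sketched as a word → image-set map; the image ids and word ids below are hypothetical:

```python
from collections import defaultdict

def build_inverted_file(image_words):
    """Map each visual word to the set of images containing it."""
    index = defaultdict(set)
    for image_id, words in image_words.items():
        for w in words:
            index[w].add(image_id)
    return index

# Hypothetical database: image id -> visual words found in that image
db = {"img0": [3, 7, 7, 9], "img1": [7, 42], "img2": [9, 42]}
inv = build_inverted_file(db)
# Candidate matches for a query containing word 7: img0 and img1
print(sorted(inv[7]))
```

At query time only images sharing words with the query need scoring, which is what makes 100,000-word vocabularies practical.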

SLIDE 36

Weighting the words

  • Just as with text, some visual words are more discriminative than others
  • The bigger the fraction of documents a word appears in, the less useful it is for matching
    – e.g., a word that appears in all documents is not helping us
    – “the”, “and”, “or” vs. “cow”, “AT&T”, “Cher”
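One standard way to realize this weighting is inverse document frequency from text retrieval; the slide only says "compute weights for each word", so the exact formula here is an assumption:

```python
import math

def idf_weights(documents, vocab):
    """Inverse document frequency: rarer words get larger weights."""
    n = len(documents)
    weights = {}
    for w in vocab:
        df = sum(1 for doc in documents if w in doc)  # document frequency
        weights[w] = math.log(n / df) if df else 0.0
    return weights

docs = [{"the", "cow"}, {"the", "cher"}, {"the", "att"}]
w = idf_weights(docs, vocab={"the", "cow", "cher", "att"})
# "the" appears in every document -> weight log(3/3) = 0
```

With visual words, "documents" are database images and each word index plays the role of a term.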
SLIDE 37

Large-scale image search

Figure: query image and top 6 results

  • Cons:
    – Performance degrades as the database grows

SLIDE 38

Large-scale image search

  • Pros:
    – Works well for CD covers, movie posters
    – Real-time performance possible

Real-time retrieval from a database of 40,000 CD covers (Nister & Stewenius, Scalable Recognition with a Vocabulary Tree)

SLIDE 39

Example bag-of-words matches

SLIDE 40

Example bag-of-words matches

SLIDE 41

SLIDE 42

Bags of features for action recognition

Space-time interest points

Juan Carlos Niebles, Hongcheng Wang and Li Fei-Fei, Unsupervised Learning of Human Action Categories Using Spatial-Temporal Words, IJCV 2008.

SLIDE 43

Bags of features for action recognition

Juan Carlos Niebles, Hongcheng Wang and Li Fei-Fei, Unsupervised Learning of Human Action Categories Using Spatial-Temporal Words, IJCV 2008.

SLIDE 44

What about spatial info?


SLIDE 45

What we will learn today

  • Visual bag of words (BoW)
  • Spatial Pyramid Matching
  • Naïve Bayes
SLIDE 46

Pyramids

  • Very useful for representing images.
  • A pyramid is built from multiple scaled copies of the image.
  • Each level in the pyramid is 1/4 the size of the previous level.
  • The lowest level has the highest resolution.
  • The highest level has the lowest resolution.
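A minimal pyramid construction, subsampling by 2 along each axis so every level has 1/4 the pixels of the previous one (real pyramids low-pass filter before subsampling; this sketch skips that):

```python
import numpy as np

def build_pyramid(img, levels):
    """Each level halves height and width, so it has 1/4 the pixels."""
    pyramid = [img]
    for _ in range(levels - 1):
        img = img[::2, ::2]  # plain subsampling, no anti-aliasing blur
        pyramid.append(img)
    return pyramid

img = np.arange(64, dtype=float).reshape(8, 8)
pyr = build_pyramid(img, levels=3)
print([p.shape for p in pyr])  # [(8, 8), (4, 4), (2, 2)]
```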
SLIDE 47

Bag of words + pyramids

SLIDE 48

Bag of words + pyramids

SLIDE 49

Bag of words + pyramids
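Combining BoW with a spatial grid gives the spatial pyramid representation of Lazebnik et al.: concatenate histograms computed over a 1×1, 2×2, 4×4, … grid of cells. A simplified sketch (it omits the per-level match weights of the full scheme, and the word map is random toy data):

```python
import numpy as np

def spatial_pyramid(word_map, vocab_size, levels=2):
    """Concatenate per-cell BoW histograms over successively finer grids.

    word_map: 2-D array of visual-word indices, one per image location.
    """
    feats = []
    h, w = word_map.shape
    for level in range(levels + 1):
        cells = 2 ** level  # cells per side at this level
        for i in range(cells):
            for j in range(cells):
                cell = word_map[i*h//cells:(i+1)*h//cells,
                                j*w//cells:(j+1)*w//cells]
                feats.append(np.bincount(cell.ravel(), minlength=vocab_size))
    return np.concatenate(feats)

word_map = np.random.default_rng(0).integers(0, 5, size=(8, 8))
feat = spatial_pyramid(word_map, vocab_size=5, levels=2)
# 1 + 4 + 16 = 21 cells, each a 5-bin histogram -> 105 dimensions
print(feat.shape)  # (105,)
```

Unlike plain BoW, two images with the same words in different spatial layouts now get different representations.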

SLIDE 50

Lazebnik, Schmid & Ponce (CVPR 2006)

Scene category dataset

Multi-class classification results (100 training images per class)

Slide credit: Svetlana Lazebnik

SLIDE 51

Caltech101 dataset

http://www.vision.caltech.edu/Image_Datasets/Caltech101/Caltech101.html

Multi-class classification results (30 training images per class)

Slide credit: Svetlana Lazebnik

SLIDE 52

What we will learn today

  • Visual bag of words (BoW)
  • Spatial Pyramid Matching
  • Naïve Bayes
SLIDE 53

Naïve Bayes

  • Classify an image using its histogram of visual-word occurrences, where:
    – $y_j$ is the event of visual word $w_j$ appearing in the image,
    – $O(j)$ is the number of times word $w_j$ occurs in the image,
    – $n$ is the number of words in our vocabulary.

Csurka, Bray, Dance & Fan, 2004

SLIDE 54

Naïve Bayes – classification

  • Our goal is to classify the image represented by $\mathbf{y}$ as belonging to the class with the highest posterior probability:

$$c^* = \arg\max_c P(c \mid \mathbf{y})$$

SLIDE 55

Naïve Bayes – conditional independence

  • The Naïve Bayes classifier assumes that visual words are conditionally independent given the object class.
  • Therefore, we can multiply the probabilities of the individual visual words to obtain the joint probability.
  • Model for an image with word histogram $\mathbf{y}$ under object class $c$:

$$P(\mathbf{y} \mid c) = \prod_{j=1}^{n} P(y_j \mid c) = \prod_{j=1}^{n} P(w_j \mid c)^{O(j)}$$

  • How do we compute $P(w_j \mid c)$?

Csurka, Bray, Dance & Fan, 2004

SLIDE 56

Naïve Bayes – prior

  • Class priors $P(c)$ encode how likely we are to see one class versus others.
  • Note that:

$$\sum_{c} P(c) = 1$$

Csurka, Bray, Dance & Fan, 2004

SLIDE 57

Naïve Bayes – posterior

  • With the equations from the previous slides, we can now calculate the probability that an image represented by $\mathbf{y}$ belongs to class $c$, using Bayes' theorem:

$$P(c \mid \mathbf{y}) = \frac{P(c)\, P(\mathbf{y} \mid c)}{\sum_{c'} P(c')\, P(\mathbf{y} \mid c')}$$

SLIDE 58

Naïve Bayes – posterior

  • Expanding $P(\mathbf{y} \mid c)$ with the conditional-independence assumption:

$$P(c \mid \mathbf{y}) = \frac{P(c)\, P(\mathbf{y} \mid c)}{\sum_{c'} P(c')\, P(\mathbf{y} \mid c')} = \frac{P(c) \prod_{j=1}^{n} P(y_j \mid c)}{\sum_{c'} P(c') \prod_{j=1}^{n} P(y_j \mid c')}$$

SLIDE 59

Naïve Bayes – classification

  • We can now classify the image represented by $\mathbf{y}$ as belonging to the class with the highest probability:

$$c^* = \arg\max_c P(c \mid \mathbf{y}) = \arg\max_c \log P(c \mid \mathbf{y})$$

SLIDE 60

Let’s break down the posterior

The probability that $\mathbf{y}$ belongs to class $c_1$:

$$P(c_1 \mid \mathbf{y}) = \frac{P(c_1) \prod_{j=1}^{n} P(y_j \mid c_1)}{\sum_{c'} P(c') \prod_{j=1}^{n} P(y_j \mid c')}$$

And the probability that $\mathbf{y}$ belongs to class $c_2$:

$$P(c_2 \mid \mathbf{y}) = \frac{P(c_2) \prod_{j=1}^{n} P(y_j \mid c_2)}{\sum_{c'} P(c') \prod_{j=1}^{n} P(y_j \mid c')}$$

SLIDE 61

Both their denominators are the same

The probability that $\mathbf{y}$ belongs to class $c_1$:

$$P(c_1 \mid \mathbf{y}) = \frac{P(c_1) \prod_{j=1}^{n} P(y_j \mid c_1)}{\sum_{c'} P(c') \prod_{j=1}^{n} P(y_j \mid c')}$$

And the probability that $\mathbf{y}$ belongs to class $c_2$:

$$P(c_2 \mid \mathbf{y}) = \frac{P(c_2) \prod_{j=1}^{n} P(y_j \mid c_2)}{\sum_{c'} P(c') \prod_{j=1}^{n} P(y_j \mid c')}$$

SLIDE 62

Both their denominators are the same

  • Since we only want the max, we can ignore the denominator:

$$P(c_1 \mid \mathbf{y}) \propto P(c_1) \prod_{j=1}^{n} P(y_j \mid c_1)$$

$$P(c_2 \mid \mathbf{y}) \propto P(c_2) \prod_{j=1}^{n} P(y_j \mid c_2)$$

SLIDE 63

For the general class $c$:

$$P(c \mid \mathbf{y}) \propto P(c) \prod_{j=1}^{n} P(y_j \mid c)$$

SLIDE 64

For the general class $c$:

$$P(c \mid \mathbf{y}) \propto P(c) \prod_{j=1}^{n} P(y_j \mid c)$$

We can take the log:

$$\log P(c \mid \mathbf{y}) \propto \log P(c) + \sum_{j=1}^{n} \log P(y_j \mid c)$$

SLIDE 65

Naïve Bayes – classification

  • So, the classification becomes:

$$c^* = \arg\max_c P(c \mid \mathbf{y}) = \arg\max_c \log P(c \mid \mathbf{y}) = \arg\max_c \left[ \log P(c) + \sum_{j=1}^{n} \log P(y_j \mid c) \right]$$
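The final decision rule maps directly to code: with $P(y_j \mid c) = P(w_j \mid c)^{O(j)}$, the per-class score is $\log P(c) + \sum_j O(j)\,\log P(w_j \mid c)$. A sketch with an invented two-class, three-word model:

```python
import numpy as np

def naive_bayes_classify(counts, log_priors, log_word_probs):
    """Pick the class maximizing log P(c) + sum_j O(j) * log P(w_j | c).

    counts:         O(j), occurrences of each visual word in the image
    log_priors:     log P(c), one entry per class
    log_word_probs: log P(w_j | c), shape (num_classes, vocab_size)
    """
    scores = log_priors + log_word_probs @ counts
    return int(np.argmax(scores))

# Toy model: class 0 favors word 0, class 1 favors word 2
word_probs = np.array([[0.7, 0.2, 0.1],
                       [0.1, 0.2, 0.7]])
log_priors = np.log(np.array([0.5, 0.5]))
counts = np.array([5.0, 1.0, 0.0])  # image dominated by word 0
print(naive_bayes_classify(counts, log_priors, np.log(word_probs)))  # 0
```

In practice $P(w_j \mid c)$ is estimated from training histograms with smoothing so no word has probability zero, which would make the log undefined.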

SLIDE 66

What we have learned today

  • Visual bag of words (BoW)
  • Spatial Pyramid Matching
  • Naïve Bayes