Mining, and Intro to Categorization
Tues April 10 Kristen Grauman UT Austin
UT Austin, CS 376 Computer Vision - lecture 21
Recognition and learning
Recognizing categories (objects, scenes, activities, attributes…), learning techniques
bag-of-words image retrieval? Why or why not?
Generalized Hough spatial verification model?
like?
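[Figure: locality-sensitive hashing. The N database items Xi and the query Q are hashed with functions r1…rk into binary keys (e.g., 111101, 110111, 110101); only the << N items that collide with Q are examined.]
Guarantees approximate near neighbors in sub-linear time, given appropriate hash functions.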
[Indyk and Motwani ‘98, Gionis et al.’99, Charikar ‘02, Andoni et al. ‘04]
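The probability that a random hyperplane separates two unit vectors depends on the angle between them:
Pr[sign(x_i^T r) ≠ sign(x_j^T r)] = (1/π) cos^-1(x_i^T x_j)
[Goemans and Williamson 1995, Charikar 2002]
High dot product: unlikely to split
Lower dot product: likely to split
Corresponding hash function:
h_r(x) = 1 if r^T x ≥ 0, and 0 otherwise, for r drawn from a zero-mean Gaussian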
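As a rough illustration of the random-hyperplane hash, the Python sketch below draws Gaussian hyperplane normals and compares the observed collision rate of two unit vectors with the value 1 - θ/π implied by the separation probability. The function names, dimension, and number of hyperplanes are illustrative assumptions, not taken from the slides.

```python
import math
import random

def random_hyperplane(dim, rng):
    """Draw a hyperplane normal with i.i.d. standard Gaussian entries."""
    return [rng.gauss(0.0, 1.0) for _ in range(dim)]

def hyperplane_bit(x, r):
    """Hash bit: 1 if x lies on the non-negative side of hyperplane r, else 0."""
    return 1 if sum(xi * ri for xi, ri in zip(x, r)) >= 0 else 0

def normalize(v):
    """Scale v to unit length."""
    norm = math.sqrt(sum(vi * vi for vi in v))
    return [vi / norm for vi in v]

rng = random.Random(0)
dim, n_planes = 8, 20000  # assumed sizes, chosen only for the demo
x = normalize([rng.gauss(0.0, 1.0) for _ in range(dim)])
y = normalize([rng.gauss(0.0, 1.0) for _ in range(dim)])
planes = [random_hyperplane(dim, rng) for _ in range(n_planes)]

# Angle between the two unit vectors, and the collision rate it predicts.
theta = math.acos(max(-1.0, min(1.0, sum(a * b for a, b in zip(x, y)))))
expected = 1.0 - theta / math.pi

# Fraction of hyperplanes that give both vectors the same bit.
observed = sum(hyperplane_bit(x, r) == hyperplane_bit(y, r) for r in planes) / n_planes
print(f"expected collision rate {expected:.3f}, observed {observed:.3f}")
```

With many hyperplanes the two rates should agree closely, which is why a small angle (high dot product) makes a split unlikely.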
sim(A1, A2) = |A1 ∩ A2| / |A1 ∪ A2|
[Figure: Venn diagram of sets A1 and A2]
[Broder, 1999]
Vocabulary: A B C D E F
Set A: A B C
Set B: B C D
Set C: A E F
Random orderings f1…f4: each word is assigned a value ~ Un(0,1) and the words are ranked by value.
Example (f1): values 0.63 0.88 0.55 0.94 0.31 0.19 for A…F give ranks 4 5 3 6 2 1.
min-Hash of each set under each ordering:
      Set A  Set B  Set C
f1:   C      C      F
f2:   A      B      A
f3:   C      C      A
f4:   B      B      E
Slide credit: Ondrej Chum
[Broder, 1999]
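A minimal Python sketch of the min-hash computation in the example above. It assumes the three sets are {A, B, C}, {B, C, D}, and {A, E, F}, and that the first six listed values (0.63 … 0.19) are assigned to words A through F in order; the function name `min_hash` is illustrative.

```python
def min_hash(item_set, values):
    """Return the element of item_set with the smallest assigned value."""
    return min(item_set, key=lambda word: values[word])

vocabulary = ["A", "B", "C", "D", "E", "F"]

# Assumed set memberships (see the note above).
sets = {"A": {"A", "B", "C"}, "B": {"B", "C", "D"}, "C": {"A", "E", "F"}}

# One random ordering f1: a value per vocabulary word.
f1 = dict(zip(vocabulary, [0.63, 0.88, 0.55, 0.94, 0.31, 0.19]))

hashes = {name: min_hash(members, f1) for name, members in sets.items()}
print(hashes)  # {'A': 'C', 'B': 'C', 'C': 'F'}
```

Sets A and B share two of their three words, so they collide under this ordering; set C shares only one word with A and none with B.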
[Figure: sets A and B and their union A ∪ B, shown ordered by f1 and ordered by f2, with the resulting min-hashes h1(A), h1(B) and h2(A), h2(B).]
P(h(A) = h(B)) = |A ∩ B| / |A ∪ B|
Slide credit: Ondrej Chum
[Broder, 1999]
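The collision probability can be checked empirically. The sketch below is a hedged illustration: the integer vocabulary, the two sets, and the number of random orderings are made-up values, and each ordering is simulated by giving every word an independent uniform value.

```python
import random

def jaccard(a, b):
    """Exact set overlap |a ∩ b| / |a ∪ b|."""
    return len(a & b) / len(a | b)

def min_hash_sketch(item_set, orderings):
    """One min-hash per ordering: the element with the smallest value."""
    return [min(item_set, key=order.__getitem__) for order in orderings]

rng = random.Random(0)
vocabulary = list(range(50))   # hypothetical visual-word ids
n_hashes = 5000                # assumed number of random orderings
orderings = [{w: rng.random() for w in vocabulary} for _ in range(n_hashes)]

A = set(range(0, 20))
B = set(range(10, 30))

sketch_a = min_hash_sketch(A, orderings)
sketch_b = min_hash_sketch(B, orderings)

# Fraction of orderings on which the two sets have the same min-hash.
estimate = sum(p == q for p, q in zip(sketch_a, sketch_b)) / n_hashes
exact = jaccard(A, B)
print(f"exact overlap {exact:.3f}, min-hash estimate {estimate:.3f}")
```

The estimate approaches the exact overlap as the number of orderings grows, which is what lets a short sketch stand in for the full set.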
concatenate outputs into hash key:
P(h_{1,...,k}(x) = h_{1,...,k}(y)) = sim(x, y)^k
independently generated hash tables
– Search/rank the union of collisions in each table, or
– Require that two examples in at least T
[Figure: two hash tables (TABLE 1, TABLE 2) with binary keys such as 111101, 110111, 110101.]
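One way to picture the key concatenation and the multiple tables is the sketch below. It is an assumption-laden toy: it uses random-hyperplane bits as the hash family, and the class name `HyperplaneLSH`, its parameters, and the vote-count ranking are illustrative choices rather than the lecture's implementation.

```python
import random
from collections import defaultdict

class HyperplaneLSH:
    """Toy LSH index: n_tables tables, each keyed by k concatenated hash bits."""

    def __init__(self, dim, k, n_tables, seed=0):
        rng = random.Random(seed)
        self.tables = []
        for _ in range(n_tables):
            # k independent hyperplanes define one k-bit key function.
            planes = [[rng.gauss(0.0, 1.0) for _ in range(dim)] for _ in range(k)]
            self.tables.append((planes, defaultdict(list)))

    @staticmethod
    def _key(planes, x):
        """Concatenate one bit per hyperplane into a string key."""
        return "".join(
            "1" if sum(a * b for a, b in zip(r, x)) >= 0 else "0" for r in planes
        )

    def add(self, item_id, x):
        for planes, buckets in self.tables:
            buckets[self._key(planes, x)].append(item_id)

    def query(self, x):
        """Return (item_id, number of tables with a collision), best first."""
        votes = defaultdict(int)
        for planes, buckets in self.tables:
            for item_id in buckets.get(self._key(planes, x), []):
                votes[item_id] += 1
        return sorted(votes.items(), key=lambda kv: kv[1], reverse=True)

index = HyperplaneLSH(dim=4, k=6, n_tables=3)
x = [1.0, 0.5, -0.2, 0.1]
index.add("x", x)
index.add("neg_x", [-v for v in x])  # opposite direction: every bit flips
result = index.query(x)
print(result)
```

Taking every item with at least one vote corresponds to searching the union of collisions; keeping only items whose count reaches some threshold corresponds to requiring collisions in several tables.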
In addition to visual search, want to be able to summarize, mine, and rank the large collection as a whole.
are most representative?
In addition to visual search, want to be able to summarize, mine, and rank the large collection as a whole. We’ll look at a few examples:
– [Geometric Min-hash, Chum et al. 2009]
– [Jing and Baluja, 2008]
– [Quack et al., 2007]
1. Detect seed pairs via hash collisions
2. Hash to related images
3. Compute connected components of the graph
Slide credit: Ondrej Chum
Contrast with frequently used quadratic-time clustering algorithms
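The last step, grouping images into connected components, can be sketched with a union-find structure. This is a minimal illustration under assumptions: images are numbered 0…n-1 and the colliding pairs are given as a list; the function name and sample pairs are hypothetical.

```python
def connected_components(n_images, collision_pairs):
    """Group images linked by hash collisions, using union-find."""
    parent = list(range(n_images))

    def find(i):
        # Follow parent links to the root, shortening the path as we go.
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    for a, b in collision_pairs:
        root_a, root_b = find(a), find(b)
        if root_a != root_b:
            parent[root_b] = root_a

    clusters = {}
    for i in range(n_images):
        clusters.setdefault(find(i), []).append(i)
    return sorted(clusters.values())

pairs = [(0, 1), (1, 2), (4, 5)]  # hypothetical seed pairs
clusters = connected_components(6, pairs)
print(clusters)  # [[0, 1, 2], [3], [4, 5]]
```

The work here grows with the number of colliding pairs rather than with all pairs of images, which is the contrast with quadratic-time clustering.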
hash key construction:
– Select first hash output according to min hash (“central word”)
– Then append subsequent hash outputs from within its neighborhood
[Chum, Perdoch, Matas, CVPR 2009]
Figure from Ondrej Chum
[Chum, Perdoch, Matas, CVPR 2009]
100,000 images downloaded from Flickr
Includes 11 Oxford landmarks with manually labeled ground truth:
All Souls, Ashmolean, Balliol, Bodleian, Christ Church, Cornmarket, Hertford, Keble, Magdalen, Pitt Rivers, Radcliffe Camera
Slide credit: Ondrej Chum
[Chum, Perdoch, Matas, CVPR 2009]
Slide credit: Ondrej Chum
Discovering small objects
In addition to visual search, want to be able to summarize, mine, and rank the large collection as a whole. We’ll look briefly at a few recent examples:
– [Geometric Min-hash, Chum et al. 2009]
– [Jing and Baluja, 2008]
– [Quack et al., 2007]
small set of “best” images to display among millions
Product search Mixed-type search
based on random walk principle.
– Application of PageRank to visual data
– Graph weights = number of matched local features between two images
– Exploit text search to narrow scope of each graph
– Use LSH to make similarity computations efficient
[Jing and Baluja, PAMI 2008]
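The random-walk ranking can be sketched as a PageRank-style power iteration over the image similarity graph. This is a hedged toy version: the damping factor 0.85, the iteration count, and the small weight matrix of matched-feature counts are assumptions for illustration, not values from the paper.

```python
def visual_rank(weights, damping=0.85, n_iters=100):
    """Power iteration on a column-normalized similarity matrix.

    weights[i][j] is the number of matched local features between images
    i and j. Every image is assumed to have at least one match, so each
    column can be normalized.
    """
    n = len(weights)
    col_sums = [sum(weights[i][j] for i in range(n)) for j in range(n)]
    rank = [1.0 / n] * n
    for _ in range(n_iters):
        new_rank = []
        for i in range(n):
            # Probability mass arriving at image i from its neighbors.
            walk = sum(weights[i][j] / col_sums[j] * rank[j] for j in range(n))
            # With probability (1 - damping) the walk jumps to a random image.
            new_rank.append(damping * walk + (1.0 - damping) / n)
        rank = new_rank
    return rank

# Hypothetical match counts: image 0 matches every other image strongly.
weights = [
    [0, 10, 8, 6],
    [10, 0, 2, 0],
    [8, 2, 0, 0],
    [6, 0, 0, 0],
]
ranks = visual_rank(weights)
best = ranks.index(max(ranks))
print(best, [round(r, 3) for r in ranks])
```

The image with the most matches to the rest of the graph ends up with the highest rank, which is the behavior the Mona Lisa example illustrates.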
[Jing and Baluja, PAMI 2008]
Similarity graph generated from top 1,000 text search results of “Mona-Lisa”.
Original has more matches to rest: highest visual rank!
[Jing and Baluja, PAMI 2008]
Similarity graph generated from top 1,000 text search results of “Lincoln Memorial”. Note the diversity of the high-ranked images.
features frequently occur in large collection?
(visual word layouts) that
(images)
data mining (e.g., Apriori algorithm, Agrawal 1993)
[Quack, Ferrari, Leibe, Van Gool, CIVR 2006, ICCV 2007]
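For readers unfamiliar with frequent itemset mining, here is a small Apriori-style sketch. It treats each transaction as a set of visual-word ids (for example, the words in one image neighborhood); the function name, the sample transactions, and the support threshold are illustrative assumptions and do not reproduce the cited system.

```python
from itertools import combinations

def frequent_itemsets(transactions, min_support):
    """Return {itemset: count} for itemsets found in >= min_support transactions."""
    transactions = [frozenset(t) for t in transactions]
    items = sorted({item for t in transactions for item in t})
    current = [frozenset([item]) for item in items]
    frequent = {}
    size = 1
    while current:
        # Count how many transactions contain each candidate itemset.
        counts = {c: sum(c <= t for t in transactions) for c in current}
        level = {c: n for c, n in counts.items() if n >= min_support}
        frequent.update(level)
        size += 1
        # Join frequent itemsets into candidates one item larger, and keep
        # only those whose smaller subsets are all frequent (Apriori pruning).
        candidates = {a | b for a in level for b in level if len(a | b) == size}
        current = [
            c for c in candidates
            if all(frozenset(s) in level for s in combinations(c, size - 1))
        ]
    return frequent

# Hypothetical transactions of visual-word ids.
transactions = [{1, 2, 3}, {1, 2, 4}, {1, 2, 3, 5}, {2, 3}]
frequent = frequent_itemsets(transactions, min_support=3)
for itemset, count in sorted(frequent.items(), key=lambda kv: (len(kv[0]), sorted(kv[0]))):
    print(sorted(itemset), count)
```

The pruning step is what keeps the search tractable: an itemset is only counted if every smaller subset of it was already frequent.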
[Quack, Ferrari, Leibe, Van Gool, CIVR 2006, ICCV 2007]
Two example itemset clusters
[Quack, Ferrari, Leibe, Van Gool, CIVR 2006, ICCV 2007]
Discovering Favorite Views of Popular Places with Iconoid
Fei-Fei Li
mountain building tree banner vendor people street lamp
Potala Palace A particular sign
flat gray made of fabric crowded
Perceptual and Sensory Augmented Computing Visual Object Recognition Tutorial
recognize a-priori unknown instances of that category and assign the correct category label.”
German shepherd animal dog living being “Fido”
[Rosch 76, Lakoff 87]
perceived shape
entire category
identifying category members
for interaction with category members
predominantly visually.
start with basic-level categorization before doing identification.
Basic-level categorization is easier and faster for humans than object identification!
How does this transfer to automatic classification algorithms?
[Figure: category hierarchy running from abstract levels (animal, quadruped) through the basic level (dog, cat, cow, …) and breeds (German shepherd, Doberman, …) down to the individual level (“Fido”).]
Biederman 1987
Source: Fei-Fei Li, Rob Fergus, Antonio Torralba.
– Recognition a fundamental part of perception
– Organize and give access to visual content
http://www.darpa.mil/grandchallenge/gallery.asp
Kooaba, Bay & Quack et al. Yeh et al., MIT Belhumeur et al.
Snavely et al. Simon & Seitz
Sivic & Zisserman Lee & Grauman Wang et al.
Objects Actions Categories
Gammeter et al.
Illumination, object pose, clutter, viewpoint, intra-class appearance, occlusions
Context cues
Context cues Function Dynamics
Video credit: J. Davis
per day! …
devoted to processing visual information [Felleman and van Essen 1991]
More Less
Slide from Pietro Perona, 2004 Object Recognition workshop
Recognizing flat, textured
covers, posters)
Reading license plates, zip codes, checks
Fingerprint recognition
Frontal face detection
[Figure: timeline of recognition datasets, with ticks at 1963, …, 1996, 2000, 2005, 2007, 2008, 2013. Datasets appear in this order:]
Roberts 1963, COIL
INRIA Pedestrians, UIUC Cars, MIT-CMU Faces
Caltech-101, Caltech-256, MSRC 21 Objects
Faces in the Wild, 80M Tiny Images, Birds-200, PASCAL VOC, ImageNet
https://pdollar.wordpress.com/2015/01/21/image-captioning/
KITTI dataset – Andreas Geiger et al.
WhittleSearch – Adriana Kovashka et al.
Activities of Daily Living – Hamed Pirsiavash et al.
learning of features and models*,**
* Labeled data availability
** Architecture design decisions, parameters.