Lecture 6 - Fei-Fei Li, Jonathan Krause 1
Lecture 6: Introduction to Detection
Jonathan Krause, Fei-Fei Li

Goal
- Locate objects in images

Variants: Pedestrian Detection
Leibe et al., 2005

Variants: Face Detection

Variants: Instance Detection
Lowe 2004

Variants: Multi-Class Detection

Application: Tagging People
(Figure: faces tagged “Putin” and “Obama”)

Application: Autonomous Driving
Huval et al., 2015

Application: Robotics
Lai et al., 2012

Application: Tracking
Berclaz et al., 2011

Application: Segmentation
Hariharan et al., 2014

Outline
- 1. Sliding Window Methods
- 2. Region-based Methods
- 3. Extra Topics

Outline
- 1. Sliding Window Methods
  - 1. Overview
  - 2. Viola-Jones Face Detection
  - 3. HOG
  - 4. Exemplar SVM
  - 5. DPM
- 2. Region-based Methods
- 3. Extra Topics

Getting Started: Kitten Detection
- Goal: Detect all kittens

Checking Windows for Kittens
- Four candidate windows are checked in turn; the answer for each is No.

Sliding Windows
- Evaluate every bounding box position

Aspect Ratio and Scale
- Even if we search all 2D positions, we still don’t know the aspect ratio or scale.
- Solution: Search multiple aspect ratios and multiple scales
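
The search over positions, scales, and aspect ratios can be sketched as a simple generator. This is only an illustration: the base window size, stride, scale set, and aspect-ratio set below are assumed values, not ones given in the lecture.

```python
def sliding_windows(img_w, img_h, base=64, scales=(1.0, 2.0),
                    aspect_ratios=(0.5, 1.0, 2.0), stride=16):
    """Yield (x, y, w, h) boxes at every stride-spaced position, for each
    combination of scale and aspect ratio (width / height)."""
    for scale in scales:
        for ratio in aspect_ratios:
            # Keep the box area near (base * scale)^2 while changing its shape.
            w = int(round(base * scale * ratio ** 0.5))
            h = int(round(base * scale / ratio ** 0.5))
            if w > img_w or h > img_h:
                continue  # this window shape does not fit in the image
            for y in range(0, img_h - h + 1, stride):
                for x in range(0, img_w - w + 1, stride):
                    yield (x, y, w, h)


# Every yielded box would be passed to the window classifier.
boxes = list(sliding_windows(482, 348))
print(len(boxes), "windows to classify")
```

Even with a coarse stride and only six window shapes, the number of windows grows quickly, which is why the per-window classifier has to be cheap.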

Viola-Jones Face Detector
- Extremely fast
- Very accurate (at the time)
Viola, Jones. 2001

Viola-Jones
- Key idea: Boosting on weak classifiers
Viola, Jones. 2001

Haar Filters
- Simple patterns of lightness and darkness
Viola, Jones. 2001

Haar Filters w/ Integral Images
- (Figure: a filter applied to an image, decomposed into smaller rectangular filters)
- Response at a single location: a signed (+/-) combination of rectangle sums
- Each rectangle sum only needs the sums of all pixels above and to the left of its corners (the integral image), which dynamic programming (DP) computes in one pass!
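
A minimal sketch of the integral-image trick, in plain Python on a list-of-lists image. The function names and the two-rectangle filter layout are illustrative choices, not taken from the Viola-Jones paper.

```python
def integral_image(img):
    """ii[y][x] = sum of img over rows < y and columns < x
    (one extra row and column of zeros simplifies the lookups)."""
    h, w = len(img), len(img[0])
    ii = [[0] * (w + 1) for _ in range(h + 1)]
    for y in range(h):
        for x in range(w):
            # DP recurrence: reuse the sums above, to the left, and diagonal.
            ii[y + 1][x + 1] = img[y][x] + ii[y][x + 1] + ii[y + 1][x] - ii[y][x]
    return ii


def rect_sum(ii, x, y, w, h):
    """Sum of the w-by-h rectangle with top-left corner (x, y): four lookups."""
    return ii[y + h][x + w] - ii[y][x + w] - ii[y + h][x] + ii[y][x]


def two_rect_haar(ii, x, y, w, h):
    """Left-minus-right two-rectangle Haar response over a (2w)-by-h window."""
    return rect_sum(ii, x, y, w, h) - rect_sum(ii, x + w, y, w, h)


img = [[1, 2, 3, 4],
       [5, 6, 7, 8],
       [9, 10, 11, 12]]
ii = integral_image(img)
print(rect_sum(ii, 1, 1, 2, 2), two_rect_haar(ii, 0, 0, 2, 3))
```

Once the integral image is built, any rectangle sum costs the same four lookups regardless of the rectangle's size, so filter responses are constant-time at every location and scale.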

Viola-Jones: Weak Classifiers
- Each Haar filter is a weak classifier
- (Figure: the top classifier and the second-best classifier)
Viola, Jones. 2001
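
Combining Weak Classifiers
- AdaBoost:
- $h_t(x)$: binary classifier on Haar filter $t$
- $\alpha_t$: learned weight on classifier $t$
- AdaBoost classifier: $H(x) = \operatorname{sign}\left(\sum_t \alpha_t h_t(x)\right)$
- Minimizes the exponential loss: $\sum_i \exp\left(-y_i \sum_t \alpha_t h_t(x_i)\right)$, with labels $y_i \in \{-1, +1\}$
Viola, Jones. 2001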
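
To make the boosting step concrete, here is a small sketch of discrete AdaBoost with threshold "stumps" as weak classifiers. It assumes each example is already described by a list of filter responses; the exhaustive stump search and the toy data are illustrative, and the actual Viola-Jones training procedure is considerably more involved.

```python
import math


def train_adaboost(features, labels, rounds):
    """features[i][j]: response of (hypothetical) filter j on example i.
    labels[i] is -1 or +1. Returns a list of (alpha, j, threshold, polarity)."""
    n = len(labels)
    weights = [1.0 / n] * n
    model = []
    for _ in range(rounds):
        best = None
        # Exhaustively pick the stump with the lowest weighted error.
        for j in range(len(features[0])):
            for thr in sorted({f[j] for f in features}):
                for pol in (1, -1):
                    preds = [pol if f[j] >= thr else -pol for f in features]
                    err = sum(w for w, p, y in zip(weights, preds, labels) if p != y)
                    if best is None or err < best[0]:
                        best = (err, j, thr, pol, preds)
        err, j, thr, pol, preds = best
        err = min(max(err, 1e-10), 1 - 1e-10)  # guard against log(0)
        alpha = 0.5 * math.log((1 - err) / err)
        model.append((alpha, j, thr, pol))
        # Up-weight mistakes, down-weight correct examples, then renormalize.
        weights = [w * math.exp(-alpha * y * p)
                   for w, p, y in zip(weights, preds, labels)]
        total = sum(weights)
        weights = [w / total for w in weights]
    return model


def predict(model, f):
    """Sign of the alpha-weighted vote of all weak classifiers."""
    score = sum(alpha * (pol if f[j] >= thr else -pol)
                for alpha, j, thr, pol in model)
    return 1 if score >= 0 else -1


# Toy 1-D problem that no single threshold can solve.
X = [[1], [2], [3], [4], [5], [6]]
y = [1, 1, -1, -1, 1, 1]
model = train_adaboost(X, y, rounds=3)
print([predict(model, f) for f in X])
```

The toy labels form a pattern that one stump cannot separate, which is the situation boosting is meant for: each round focuses on the examples the previous stumps got wrong.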

Cascade
- Reject negatives quickly: cheap early stages discard most windows, so only a few reach the later, costlier stages
Viola, Jones. 2001
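
The early-rejection logic of a cascade can be sketched in a few lines. The stage scoring functions, feature names, and thresholds here are hypothetical stand-ins for the boosted classifiers of increasing cost.

```python
def cascade_classify(window_features, stages):
    """stages: list of (score_fn, threshold), ordered cheap to expensive.
    Returns False as soon as any stage rejects the window."""
    for score_fn, threshold in stages:
        if score_fn(window_features) < threshold:
            return False  # rejected early; later (costlier) stages never run
    return True  # survived every stage: report a detection


calls = []  # records which stages actually ran


def make_stage(name, key):
    """Build a toy stage that just reads a precomputed score (illustration only)."""
    def score(feats):
        calls.append(name)
        return feats[key]
    return score


stages = [(make_stage("cheap", "a"), 0.5), (make_stage("costly", "b"), 0.5)]
print(cascade_classify({"a": 0.1, "b": 0.9}, stages), calls)
```

Since the vast majority of windows in an image are negatives, most of them only ever pay for the first stage.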

Viola-Jones Summary
- Fast at runtime
- Takes a long time to train
- Very accurate (at the time)
- Inspired other detection methods

HOG
- Histograms of Oriented Gradients
- Designed for Pedestrian Detection
- Really just good feature engineering
Dalal, Triggs. 2005
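
As a rough illustration of what a "histogram of oriented gradients" is, the sketch below builds the orientation histogram for a single cell. It is a simplification under stated assumptions: 9 unsigned orientation bins, hard bin assignment, and gradients taken only at interior pixels. The full descriptor also involves vote interpolation and block normalization, which are not shown.

```python
import math


def cell_histogram(cell, n_bins=9):
    """Magnitude-weighted histogram of unsigned gradient orientations
    (0 to 180 degrees) for one cell, given as a 2D list of gray values."""
    h, w = len(cell), len(cell[0])
    hist = [0.0] * n_bins
    bin_width = 180.0 / n_bins
    for y in range(1, h - 1):
        for x in range(1, w - 1):
            gx = cell[y][x + 1] - cell[y][x - 1]  # centered [-1, 0, 1] filter
            gy = cell[y + 1][x] - cell[y - 1][x]
            magnitude = math.hypot(gx, gy)
            angle = math.degrees(math.atan2(gy, gx)) % 180.0  # unsigned
            hist[int(angle // bin_width) % n_bins] += magnitude
    return hist


# A vertical edge: all gradient energy lands in the 0-degree bin.
vertical_edge = [[0, 0, 1, 1] for _ in range(4)]
print(cell_histogram(vertical_edge))
```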

HOG
- Lots of feature engineering…
Dalal, Triggs. 2005

More feature engineering
Dalal, Triggs. 2005

But it works
(Figure panels: avg. gradient; max pos. SVM weight; min neg. SVM weight; HOG; pos. SVM weights; neg. SVM weights)
Dalal, Triggs. 2005

Exemplar SVM
- Key idea: Train a separate SVM for each positive training example (on HOG features!).
Malisiewicz et al. 2011

Exemplar SVM
- Q: But wait, isn’t that going to be horribly slow?
- A: Yep! Much slower than a single SVM. No one I know of actually uses this. However…
- Each detection is tied to one training example, so that example’s metadata (segmentations!) can be transferred to the detection
Malisiewicz et al. 2011

Exemplar SVM Examples
Malisiewicz et al. 2011

Deformable Part Models
- (sneak preview of student presentation)
- Similar to SVM on HOG, but also with parts (latent SVM)
- State of the art for several years

Sliding Window Summary
- Evaluate classifier at many positions
- Dominant detection paradigm until ~2 years ago
- Boosting, SVM, and DPM

Outline
- 1. Sliding Window Methods
- 2. Region-based Methods
  - 1. Motivation
  - 2. Region Proposals
  - 3. R-CNN
- 3. Extra Topics

Sliding Window Problem: Efficiency
- Q: How many bounding boxes are there in this 482 x 348 image?
- A: 6,999,078,138 (about 7 billion)
- Can’t classify 7 billion windows; even millions is slow. Can we massively cut down this number (e.g., to 1000s)?
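
The box count can be reproduced with a short calculation. The counting convention is an assumption inferred from the number on the slide: a box is defined by two distinct column coordinates and two distinct row coordinates.

```python
from math import comb


def count_boxes(width, height):
    """Number of axis-aligned boxes whose left/right columns and top/bottom
    rows are distinct pixel coordinates: C(width, 2) * C(height, 2)."""
    return comb(width, 2) * comb(height, 2)


print(f"{count_boxes(482, 348):,}")  # prints 6,999,078,138
```

The count grows with the fourth power of the image side length, so exhaustive search is hopeless even for small images.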

Detection on Regions
- Generate detection proposals (typically ~2000)
- Classify each region with a much stronger classifier
- This approach has more or less taken over modern detection
van de Sande et al., 2011

Region Proposals
- Generated by sliding windows or by grouping pixels
- May or may not output a score per region
- Varying amount of control over the number of regions
“What makes for effective detection proposals?”. Hosang, Benenson, Dollar, Schiele. 2015

Objectness
- Sliding window
- Score based on a bunch of heuristic features
Alexe, Deselaers, Ferrari. 2010

Selective Search
- Start from Felzenszwalb superpixels
- Merge based on color features
- Most common method in use
van de Sande et al., 2011

Edge Boxes
- Structured decision forest for object boundaries
- Coarse sliding windows with location refinement
- Seems fast and accurate, but time will tell
Zitnick, Dollar. 2014

Evaluating Region Proposals
- What fraction of ground truth bounding boxes do they recover?
- How many proposals does it take?
- At what IoU (intersection over union) overlap threshold?
“What makes for effective detection proposals?”. Hosang, Benenson, Dollar, Schiele. 2015
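
The three questions above combine into one measurement: recall of ground-truth boxes at a given IoU threshold and proposal budget. A minimal sketch, assuming boxes are `(x1, y1, x2, y2)` tuples and proposals are already sorted best-first (both are conventions chosen here for illustration):

```python
def iou(a, b):
    """Intersection over union of two boxes given as (x1, y1, x2, y2)."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0, ix2 - ix1) * max(0, iy2 - iy1)
    union = (a[2] - a[0]) * (a[3] - a[1]) + (b[2] - b[0]) * (b[3] - b[1]) - inter
    return inter / union if union > 0 else 0.0


def proposal_recall(gt_boxes, proposals, iou_threshold=0.7, max_proposals=None):
    """Fraction of ground-truth boxes covered by at least one of the first
    max_proposals proposals with IoU >= iou_threshold."""
    kept = proposals[:max_proposals]  # None keeps all proposals
    hits = sum(1 for g in gt_boxes
               if any(iou(g, p) >= iou_threshold for p in kept))
    return hits / len(gt_boxes) if gt_boxes else 0.0


gt = [(0, 0, 10, 10), (20, 20, 30, 30)]
props = [(0, 0, 10, 9), (50, 50, 60, 60), (21, 21, 30, 30)]
print(proposal_recall(gt, props), proposal_recall(gt, props, max_proposals=2))
```

Sweeping `iou_threshold` or `max_proposals` gives the recall curves typically used to compare proposal methods.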

In Practice
- Recall at IoU threshold = 0.7 predicts detection performance well
- Most people use ~2000 regions produced with Selective Search (a few seconds/image)
- Edge Boxes looks promising

Aside: Classification
- Most detectors, region proposal methods in particular, reduce detection to repeated classification
- Let’s take a look at a few key ideas in classification

Classification: Bag of Words
- Pipeline: Descriptors -> Codebook -> Histogram (codeword frequency) -> SVM
- Offline: Cluster descriptors in training images to build the codebook
- Note: No spatial information
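
The histogram step of the pipeline can be sketched as nearest-codeword voting. The codebook is assumed to have been learned offline already (for example by k-means), and the final SVM is not shown; the 2-D descriptors below are toy values.

```python
def bow_histogram(descriptors, codebook):
    """Normalized codeword-frequency histogram for one image.
    descriptors, codebook: lists of equal-length numeric vectors."""
    counts = [0] * len(codebook)
    for d in descriptors:
        dists = [sum((di - ci) ** 2 for di, ci in zip(d, c)) for c in codebook]
        counts[dists.index(min(dists))] += 1  # vote for the nearest codeword
    total = sum(counts)
    return [c / total for c in counts] if total else [0.0] * len(codebook)


codebook = [[0, 0], [10, 10]]
descriptors = [[1, 0], [0, 1], [9, 9], [1, 1]]
print(bow_histogram(descriptors, codebook))
```

Shuffling the descriptors leaves the histogram unchanged, which is exactly the "no spatial information" limitation noted above.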

Classification: Spatial Pyramid
- (Figure: per-region histograms feed one big SVM)
Lazebnik et al. 2006

Classification
- Sparse Coding (LLC: Locality-constrained Linear Coding)
  - Represent descriptor with more than one codeword
- Fisher Vectors
  - Represent difference between descriptor and codewords (very roughly)
- A little better, still used sometimes
Wang et al. 2010; Perronnin et al. 2010

2012
- In 2012 neural networks started working [Krizhevsky et al. 2012]
Russakovsky et al. 2015

Neural Nets
- Learn the whole pipeline (pixels to classes) from scratch.
- Many layers of (learned) intermediate features
- Will see more in student presentation
Krizhevsky et al. 2012

R-CNN
- R-CNN = Selective Search + CNN
- That’s it.
Girshick et al. 2014

R-CNN Details
- Need region to fit input size of CNN
- Region warping method: add context, pad the region with zeros, or warp; warp works the best
Girshick et al. 2014

R-CNN Details
- Context around region
- 0 or 16 pixels (in CNN reference frame); 16 works the best
Girshick et al. 2014

R-CNN Details
- Which CNN layer the features come from is important
- fc6 best?
Girshick et al. 2014

R-CNN Details
- Fine-tuning on PASCAL (CNN trained on ILSVRC)
- It helps, and may make a different layer the best choice
Girshick et al. 2014

R-CNN Details
- Bounding box regression
- Regress from CNN features to bounding box
- Helps quite a bit
Girshick et al. 2014
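
For bounding box regression, the regressor is usually trained to predict offsets relative to the proposal rather than absolute coordinates. The sketch below uses the center/size parameterization commonly associated with R-CNN (shift scaled by proposal size, log-ratio of sizes); the function names are illustrative, and the regressor that maps CNN features to these offsets is not shown.

```python
import math


def regression_targets(proposal, gt):
    """Boxes are (cx, cy, w, h). Returns scale-invariant offsets
    (tx, ty, tw, th) that map the proposal onto the ground-truth box."""
    px, py, pw, ph = proposal
    gx, gy, gw, gh = gt
    return ((gx - px) / pw, (gy - py) / ph,
            math.log(gw / pw), math.log(gh / ph))


def apply_offsets(proposal, t):
    """Inverse of regression_targets: shift and rescale the proposal."""
    px, py, pw, ph = proposal
    tx, ty, tw, th = t
    return (px + pw * tx, py + ph * ty, pw * math.exp(tw), ph * math.exp(th))


proposal = (50.0, 50.0, 20.0, 40.0)
gt_box = (55.0, 60.0, 40.0, 20.0)
targets = regression_targets(proposal, gt_box)
print(targets, apply_offsets(proposal, targets))
```

At test time the predicted offsets are applied to each proposal with `apply_offsets` to obtain the refined box.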

R-CNN Details
- Train SVM on top of CNN features
- Be careful about which are positives and which are negatives (use the IoU overlap!)
- Hard negative mining for efficiency.
Girshick et al. 2014

Outline
- 1. Sliding Window Methods
- 2. Region-based Methods
- 3. Extra Topics

Evaluation
- Typically done with Average Precision (AP)
- When considering multiple classes, use mean (across classes) Average Precision (mAP)
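
A sketch of how AP and mAP can be computed for a ranked list of detections. This is one common variant (area under the precision-recall curve after making precision monotonically non-increasing); benchmarks differ in the exact interpolation. It assumes each detection has already been marked as a true or false positive and that `num_gt` is greater than zero.

```python
def average_precision(scores, is_true_positive, num_gt):
    """Area under the precision-recall curve for one class.
    scores[i]: confidence of detection i.
    is_true_positive[i]: whether detection i matched an unmatched ground truth.
    num_gt: number of ground-truth boxes for the class (assumed > 0)."""
    order = sorted(range(len(scores)), key=lambda i: -scores[i])
    tp = fp = 0
    precisions, recalls = [], []
    for i in order:
        if is_true_positive[i]:
            tp += 1
        else:
            fp += 1
        precisions.append(tp / (tp + fp))
        recalls.append(tp / num_gt)
    # Replace each precision by the best precision at any higher recall.
    for k in range(len(precisions) - 2, -1, -1):
        precisions[k] = max(precisions[k], precisions[k + 1])
    ap, prev_recall = 0.0, 0.0
    for p, r in zip(precisions, recalls):
        ap += p * (r - prev_recall)
        prev_recall = r
    return ap


def mean_average_precision(per_class_ap):
    """mAP: the plain mean of the per-class AP values."""
    return sum(per_class_ap) / len(per_class_ap)


ap = average_precision([0.9, 0.8, 0.7], [True, False, True], num_gt=2)
print(ap, mean_average_precision([ap, 1.0]))
```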

Context
- Surroundings can provide information
- Many methods use a weak version of this
- (Figure: What object is this?)

Non-maximal Suppression
- Turn multiple detections into one
- Common approach: among bounding boxes that overlap with IoU >= 0.5 (or some other threshold), keep the higher-scoring box.
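
The common approach described above is usually implemented greedily: visit detections from highest to lowest score and drop any box that overlaps an already-kept box too much. A minimal sketch, assuming `(x1, y1, x2, y2)` boxes (a convention chosen here for illustration):

```python
def iou(a, b):
    """Intersection over union of two boxes given as (x1, y1, x2, y2)."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0, ix2 - ix1) * max(0, iy2 - iy1)
    union = (a[2] - a[0]) * (a[3] - a[1]) + (b[2] - b[0]) * (b[3] - b[1]) - inter
    return inter / union if union > 0 else 0.0


def non_max_suppression(boxes, scores, iou_threshold=0.5):
    """Greedy NMS: returns indices of kept boxes, highest score first."""
    order = sorted(range(len(boxes)), key=lambda i: -scores[i])
    keep = []
    for i in order:
        # Keep box i only if no higher-scoring kept box overlaps it too much.
        if all(iou(boxes[i], boxes[k]) < iou_threshold for k in keep):
            keep.append(i)
    return keep


boxes = [(0, 0, 10, 10), (1, 1, 11, 11), (50, 50, 60, 60)]
print(non_max_suppression(boxes, [0.9, 0.8, 0.7]))
```

This is typically run separately for each class, so that overlapping detections of different classes do not suppress each other.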

OverFeat
- Efficient sliding windows with CNNs
Sermanet et al. 2013

Fast R-CNN
- Very new, reuses most CNN computation across regions
Girshick. 2015

Multibox
- Try to learn the region proposals
Erhan et al. 2014

Detection Challenges: PASCAL
- 20 object categories, thousands of images
- 2007-2012
- Was the standard detection dataset for a long time.

Detection Challenges: ILSVRC
- 200 object categories, hundreds of thousands of images
- 2013-current
- Not all images fully annotated.