Object detection
1 CV3DST | Prof. Leal-Taixé
Task definition: object detection problem
Bounding box: (x, y, w, h), where (x, y) is the box position, w its width and h its height.
Bounding box (x, y, w, h) + class
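A detection as defined here (box plus class) could be stored as a small record. This is only a sketch: the name `Detection`, the field names, and the reading of (x, y) as the top-left corner are assumptions made for illustration, not something the slides fix.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Detection:
    """One detected object: a bounding box plus a class label (illustrative names)."""
    x: float    # box position, horizontal (assumed here: top-left corner)
    y: float    # box position, vertical (assumed here: top-left corner)
    w: float    # box width
    h: float    # box height
    label: str  # object class, e.g. "person"

    def corners(self):
        """Return (x_min, y_min, x_max, y_max) under the top-left assumption."""
        return (self.x, self.y, self.x + self.w, self.y + self.h)


det = Detection(x=10.0, y=20.0, w=30.0, h=40.0, label="person")
print(det.corners())
```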
Figure: image and template
For every position, evaluate how strongly the pixels in the image and in the template correlate. The correlation is LOW where the template does not line up with the object and HIGH where it does.
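The position-by-position correlation described above can be sketched in a few lines. This is a minimal illustration, assuming grayscale images stored as nested lists and normalized cross-correlation as the correlation measure (the slides do not say which measure is used); the function names `ncc` and `match_template` are made up for this example.

```python
import math


def ncc(patch, template):
    """Normalized cross-correlation between two equally sized 2-D lists."""
    a = [v for row in patch for v in row]
    b = [v for row in template for v in row]
    mean_a = sum(a) / len(a)
    mean_b = sum(b) / len(b)
    da = [v - mean_a for v in a]
    db = [v - mean_b for v in b]
    denom = math.sqrt(sum(v * v for v in da) * sum(v * v for v in db))
    if denom == 0.0:  # flat patch or flat template: define the score as 0
        return 0.0
    return sum(p * q for p, q in zip(da, db)) / denom


def match_template(image, template):
    """Slide the template over every valid position; return (best_score, (row, col))."""
    th, tw = len(template), len(template[0])
    best = (-2.0, (0, 0))  # NCC lies in [-1, 1], so -2 is below any real score
    for r in range(len(image) - th + 1):
        for c in range(len(image[0]) - tw + 1):
            patch = [row[c:c + tw] for row in image[r:r + th]]
            score = ncc(patch, template)
            if score > best[0]:
                best = (score, (r, c))
    return best


image = [
    [0, 0, 0, 0, 0],
    [0, 0, 9, 1, 0],
    [0, 0, 1, 9, 0],
    [0, 0, 0, 0, 0],
]
template = [[9, 1], [1, 9]]
score, position = match_template(image, template)
print(score, position)
```

The double loop visits every position at a single scale, which already hints at why an exhaustive sliding-window search becomes expensive once scale and aspect ratio are unknown as well.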
– Occlusions: we need to see the WHOLE object.
– This works to detect a given instance of an object, but not a class of objects (appearance and shape changes, pose changes).
– Objects have an unknown position, scale and aspect ratio, so the sliding window searches the search space inefficiently.
– Learning multiple weak learners to build a strong classifier
– That is, make many small decisions and combine them for a stronger final decision
Viola and Jones. Rapid object detection using a boosted cascade of simple features. CVPR 2001.
Haar features
– Step 1: Select your Haar-like features
– Step 2: Integral image for fast feature evaluation
  … correlation with my feature (template)
– Step 3: AdaBoost to find weak learners
  … image locations
  … learners
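Step 2 above relies on the integral image, which makes the sum over any rectangle cost four lookups regardless of its size. The sketch below assumes a grayscale image as nested lists and a two-rectangle feature computed as left half minus right half; the function names and this sign convention are illustrative choices, not taken from the slides.

```python
def integral_image(img):
    """ii[r][c] = sum of img over rows < r and columns < c (extra zero row/column)."""
    rows, cols = len(img), len(img[0])
    ii = [[0] * (cols + 1) for _ in range(rows + 1)]
    for r in range(rows):
        for c in range(cols):
            ii[r + 1][c + 1] = img[r][c] + ii[r][c + 1] + ii[r + 1][c] - ii[r][c]
    return ii


def box_sum(ii, r, c, h, w):
    """Sum of the h-by-w rectangle whose top-left pixel is (r, c), from four lookups."""
    return ii[r + h][c + w] - ii[r][c + w] - ii[r + h][c] + ii[r][c]


def haar_two_rect_horizontal(ii, r, c, h, w):
    """Two-rectangle Haar-like feature: left half minus right half (w must be even)."""
    half = w // 2
    return box_sum(ii, r, c, h, half) - box_sum(ii, r, c + half, h, half)


img = [
    [5, 5, 1, 1],
    [5, 5, 1, 1],
    [5, 5, 1, 1],
]
ii = integral_image(img)
print(box_sum(ii, 0, 0, 3, 4))                   # sum over the whole image
print(haar_two_rect_horizontal(ii, 0, 0, 3, 4))  # bright left half vs. dark right half
```

The integral image is built once per image; after that, every Haar-like feature at every location and size is evaluated in constant time.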
Average gradient image over training samples → gradients provide shape information. Let us create a descriptor that exploits that. Gradient: blue arrows show the gradient, i.e., the direction of greatest change of the image.
Dalal and Triggs. Histogram of oriented gradients for human detection. CVPR 2005.
HOG descriptor → Histogram of oriented gradients. Divide the image into a dense grid of cells, compute the gradients, and create a histogram per cell based on gradient direction.
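The histogram for a single cell can be sketched as follows. This is a simplified illustration, not the full descriptor: it assumes 9 bins over unsigned orientations (0 to 180 degrees), centered differences for the gradient, and magnitude-weighted votes, and it leaves out the vote interpolation and block normalization that a complete HOG implementation adds. The function name `cell_histogram` is made up for this example.

```python
import math


def cell_histogram(cell, n_bins=9):
    """Histogram of unsigned gradient orientations (0-180 degrees) for one cell.

    Each interior pixel votes for the bin of its gradient direction,
    weighted by its gradient magnitude.
    """
    hist = [0.0] * n_bins
    bin_width = 180.0 / n_bins
    for r in range(1, len(cell) - 1):
        for c in range(1, len(cell[0]) - 1):
            gx = cell[r][c + 1] - cell[r][c - 1]  # centered horizontal difference
            gy = cell[r + 1][c] - cell[r - 1][c]  # centered vertical difference
            magnitude = math.hypot(gx, gy)
            angle = math.degrees(math.atan2(gy, gx)) % 180.0
            hist[int(angle // bin_width) % n_bins] += magnitude
    return hist


# A cell containing a vertical edge: intensity changes only along the horizontal axis,
# so all gradient votes land in the first orientation bin.
cell = [[0, 0, 10, 10]] * 4
print(cell_histogram(cell))
```

Concatenating such histograms over all cells of a detection window gives the feature vector that is then passed to a classifier.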
– Step 1: Choose your training set of images that contain the object you want to detect.
– Step 2: Choose a set of images that do NOT contain that object.
– Step 3: Extract HOG features on both sets.
– Step 4: Train an SVM classifier on the two sets to detect whether a feature vector represents the object of interest.
HOG features weighted by the positive SVM weights – the ones used for the pedestrian object classifier.
detection → more robust to different body poses
Felzenszwalb et al. A discriminatively trained, multiscale, deformable part model. CVPR 2008.
Class-agnostic objectness measure: how likely it is for an image region to contain an object
Figure: regions labeled "Very likely to be an object" and "Maybe it is an …"
Object proposals or regions of interest (RoI): where to focus.
+ classifier
Selective search: van de Sande et al. Segmentation as selective search for object recognition. ICCV 2011.
Edge boxes: Zitnick and Dollar. Edge boxes: locating
Start with anchor box i
  For another box j
    If they overlap
      Discard box i if the score is lower than the score of j
Overlap = to be defined
Score = depends on the task
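The suppression rule above can be sketched as the greedy variant that is commonly used in practice: visit boxes from highest to lowest score and keep a box only if it does not overlap a box that was already kept. Several choices here are assumptions, since the slide leaves them open: overlap is measured as intersection over union, the threshold is 0.5, and boxes are (x, y, w, h) with (x, y) as the top-left corner. The function names are illustrative.

```python
def iou(a, b):
    """Intersection over union of two boxes (x, y, w, h), with (x, y) the top-left corner."""
    ix = max(0.0, min(a[0] + a[2], b[0] + b[2]) - max(a[0], b[0]))
    iy = max(0.0, min(a[1] + a[3], b[1] + b[3]) - max(a[1], b[1]))
    inter = ix * iy
    union = a[2] * a[3] + b[2] * b[3] - inter
    return inter / union if union > 0 else 0.0


def nms(boxes, scores, overlap_threshold=0.5):
    """Greedy non-maximum suppression: return indices of kept boxes, best score first."""
    order = sorted(range(len(boxes)), key=lambda i: scores[i], reverse=True)
    keep = []
    for i in order:
        # Box i survives only if it does not overlap any higher-scoring kept box.
        if all(iou(boxes[i], boxes[j]) <= overlap_threshold for j in keep):
            keep.append(i)
    return keep


boxes = [(10, 10, 50, 50), (12, 12, 50, 50), (100, 100, 40, 40)]
scores = [0.9, 0.8, 0.7]
print(nms(boxes, scores))
```

In this example the second box nearly coincides with the first and has a lower score, so it is discarded, while the third box overlaps nothing and is kept.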
Intersection over Union (IoU) or Jaccard Index: J(A, B) = |A ∩ B| / |A ∪ B|
Figure: boxes A and B with their intersection and their union highlighted
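For axis-aligned boxes the Jaccard index reduces to a few lines of arithmetic. The sketch below assumes boxes given as (x, y, w, h) with (x, y) as the top-left corner, which is one possible reading of the box format; the function name `iou` is illustrative.

```python
def iou(box_a, box_b):
    """Jaccard index J(A, B) = |A ∩ B| / |A ∪ B| for boxes given as (x, y, w, h)."""
    ax, ay, aw, ah = box_a
    bx, by, bw, bh = box_b
    # Width and height of the overlap rectangle (zero if the boxes are disjoint).
    inter_w = max(0.0, min(ax + aw, bx + bw) - max(ax, bx))
    inter_h = max(0.0, min(ay + ah, by + bh) - max(ay, by))
    intersection = inter_w * inter_h
    union = aw * ah + bw * bh - intersection
    return intersection / union if union > 0 else 0.0


print(iou((0, 0, 2, 2), (1, 1, 2, 2)))  # partial overlap
print(iou((0, 0, 2, 2), (0, 0, 2, 2)))  # identical boxes
print(iou((0, 0, 2, 2), (5, 5, 2, 2)))  # disjoint boxes
```

The value ranges from 0 for disjoint boxes to 1 for identical boxes; the partially overlapping pair shares 1 unit of area out of a union of 7.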
Hosang, Benenson and Schiele. A Convnet for Non-Maximum Suppression. 2015
Figure: ground truth positions; false positives lead to low precision
Figure: ground truth position, false positive and false negative; false negatives lead to low recall
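The two failure modes above map directly onto the standard definitions precision = TP / (TP + FP) and recall = TP / (TP + FN). The sketch below uses made-up counts purely for illustration; the function name `precision_recall` is not from the slides.

```python
def precision_recall(true_positives, false_positives, false_negatives):
    """Precision = TP / (TP + FP); recall = TP / (TP + FN)."""
    detected = true_positives + false_positives
    actual = true_positives + false_negatives
    precision = true_positives / detected if detected else 0.0
    recall = true_positives / actual if actual else 0.0
    return precision, recall


# Illustrative counts: duplicate detections survive and count as false positives.
print(precision_recall(true_positives=2, false_positives=2, false_negatives=0))
# Illustrative counts: a correct detection of a nearby object was suppressed.
print(precision_recall(true_positives=1, false_positives=0, false_negatives=1))
```

The first call models suppression that is too permissive (precision drops to 0.5 while recall stays at 1.0); the second models suppression that is too aggressive (precision stays at 1.0 while recall drops to 0.5).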
methods (even Deep Learning ones) use NMS!
Two-stage: Image → Feature extraction → Extraction of proposals → Classification + Localization
  Classification: class score (cat, dog, person)
  Localization: refine bounding box (Δx, Δy, Δw, Δh)
One-stage: Image → Feature extraction → Classification + Localization
  Classification: class score (cat, dog, person)
  Localization: bounding box (x, y, w, h)
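How the offsets (Δx, Δy, Δw, Δh) are applied to a proposal is not specified on the slide. The sketch below assumes the parameterization used in R-CNN: (x, y) is the box center, Δx and Δy are shifts measured in units of the box width and height, and Δw and Δh are log-scale factors. The function name `apply_deltas` is made up for this example.

```python
import math


def apply_deltas(proposal, deltas):
    """Refine a proposal (x, y, w, h) with offsets (dx, dy, dw, dh).

    Assumed parameterization (R-CNN style): (x, y) is the box center,
    dx and dy shift the center relative to the box size, and dw and dh
    scale width and height in log space.
    """
    x, y, w, h = proposal
    dx, dy, dw, dh = deltas
    return (x + w * dx, y + h * dy, w * math.exp(dw), h * math.exp(dh))


refined = apply_deltas((50.0, 40.0, 20.0, 10.0), (0.1, -0.2, 0.0, 0.0))
print(refined)
```

With all offsets at zero the proposal is returned unchanged, which is why predicting offsets relative to a proposal is an easier regression target than predicting absolute coordinates.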
One-stage detectors:
– YOLO, SSD, RetinaNet
– CenterNet, CornerNet, ExtremeNet
Two-stage detectors:
– R-CNN, Fast R-CNN, Faster R-CNN
– SPP-Net, R-FCN, FPN