Object detection | CV3DST | Prof. Leal-Taixé

slide-1
SLIDE 1

Object detection

CV3DST | Prof. Leal-Taixé

slide-2
SLIDE 2

Task definition

  • Object detection problem

Bounding box: (x, y, w, h), i.e., a position (x, y) plus a width w and a height h.

slide-3
SLIDE 3

Task definition

  • Object detection problem

Bounding box (x, y, w, h) + class

slide-4
SLIDE 4

A bit of history

slide-5
SLIDE 5

Traditional object detection methods

  • 1. Template matching + sliding window

[Figure: image and template]

slide-6
SLIDE 6

Traditional object detection methods

  • 1. Template matching + sliding window

[Figure: image]

slide-7
SLIDE 7

Traditional object detection methods

  • 1. Template matching + sliding window

For every position, you evaluate how strongly the pixels in the image and in the template correlate.

[Figure: image with the template at a position of LOW correlation]

slide-8
SLIDE 8

Traditional object detection methods

  • 1. Template matching + sliding window

For every position, you evaluate how strongly the pixels in the image and in the template correlate.

[Figure: image with the template at a position of HIGH correlation]
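The sliding-window evaluation can be sketched in a few lines. This is a minimal illustration, not the lecture's implementation: it assumes grayscale images stored as nested lists and uses normalized cross-correlation as the correlation measure. The function names `ncc` and `match_template` are hypothetical.

```python
import math

def ncc(patch, template):
    """Normalized cross-correlation of two equally sized 2D lists, in [-1, 1]."""
    a = [v for row in patch for v in row]
    b = [v for row in template for v in row]
    mean_a = sum(a) / len(a)
    mean_b = sum(b) / len(b)
    da = [v - mean_a for v in a]
    db = [v - mean_b for v in b]
    denom = math.sqrt(sum(v * v for v in da) * sum(v * v for v in db))
    if denom == 0:
        return 0.0  # flat patch or template: treat the correlation as 0
    return sum(x * y for x, y in zip(da, db)) / denom

def match_template(image, template):
    """Slide the template over every position; return (best_score, (row, col))."""
    th, tw = len(template), len(template[0])
    ih, iw = len(image), len(image[0])
    best_score, best_pos = -2.0, None
    for r in range(ih - th + 1):
        for c in range(iw - tw + 1):
            patch = [row[c:c + tw] for row in image[r:r + th]]
            score = ncc(patch, template)
            if score > best_score:
                best_score, best_pos = score, (r, c)
    return best_score, best_pos

# Toy data (made up for illustration): the template appears at row 1, column 2.
image = [
    [0, 0, 0, 0, 0],
    [0, 0, 9, 1, 0],
    [0, 0, 1, 9, 0],
    [0, 0, 0, 0, 0],
]
template = [[9, 1], [1, 9]]
score, position = match_template(image, template)
```

The search visits every position at a single scale and aspect ratio. Covering other scales would mean repeating the whole loop per template size.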

slide-9
SLIDE 9

Traditional object detection methods

  • Problems of 1. Template matching + sliding window

For every position, you evaluate how strongly the pixels in the image and in the template correlate.

[Figure: image with the template at a position of LOW correlation]

slide-10
SLIDE 10

Traditional object detection methods

  • Problems of 1. Template matching + sliding window

– Occlusions: we need to see the WHOLE object
– This works to detect a given instance of an object but not a class of objects

[Figure: appearance and shape changes; pose changes]

slide-11
SLIDE 11

Traditional object detection methods

  • Problems of 1. Template matching + sliding window

– Occlusions: we need to see the WHOLE object
– This works to detect a given instance of an object but not a class of objects
– Objects have an unknown position, scale and aspect ratio, so the sliding window searches the space inefficiently

slide-12
SLIDE 12

Traditional object detection methods

  • 2. Feature extraction + classification

slide-13
SLIDE 13

Viola-Jones detector

  • 2. Feature extraction + classification

– Learning multiple weak learners to build a strong classifier
– That is, make many small decisions and combine them for a stronger final decision

Viola and Jones. Rapid object detection using a boosted cascade of simple features. CVPR 2001.
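A minimal sketch of combining weak learners into a strong classifier with AdaBoost. The weak learners here are simple threshold stumps on raw feature values, which is an assumption made for illustration. The Viola-Jones detector builds its weak learners on Haar-like features and also arranges them in a cascade, which this sketch omits. All function names and the toy data are hypothetical.

```python
import math

def stump_predict(stump, x):
    """A stump votes +polarity if the feature reaches the threshold, else -polarity."""
    feature, threshold, polarity = stump
    return polarity if x[feature] >= threshold else -polarity

def train_stump(X, y, w):
    """Exhaustively pick the stump with the lowest weighted error."""
    best_err, best_stump = None, None
    for feature in range(len(X[0])):
        for threshold in sorted({x[feature] for x in X}):
            for polarity in (1, -1):
                stump = (feature, threshold, polarity)
                err = sum(wi for wi, xi, yi in zip(w, X, y)
                          if stump_predict(stump, xi) != yi)
                if best_err is None or err < best_err:
                    best_err, best_stump = err, stump
    return best_err, best_stump

def adaboost(X, y, rounds):
    """Labels must be +1 / -1. Returns a list of (alpha, stump) pairs."""
    n = len(X)
    w = [1.0 / n] * n
    ensemble = []
    for _ in range(rounds):
        err, stump = train_stump(X, y, w)
        err = min(max(err, 1e-10), 1 - 1e-10)    # avoid log(0) and division by zero
        alpha = 0.5 * math.log((1 - err) / err)  # weight of this weak learner
        ensemble.append((alpha, stump))
        # Increase the weight of misclassified samples, decrease the others.
        w = [wi * math.exp(-alpha * yi * stump_predict(stump, xi))
             for wi, xi, yi in zip(w, X, y)]
        total = sum(w)
        w = [wi / total for wi in w]
    return ensemble

def predict(ensemble, x):
    """Strong classifier: sign of the weighted sum of weak decisions."""
    score = sum(alpha * stump_predict(stump, x) for alpha, stump in ensemble)
    return 1 if score >= 0 else -1

# Toy 1D data (made up): no single threshold separates the two classes.
X = [[1], [2], [3], [4], [5], [6]]
y = [1, 1, -1, -1, 1, 1]
ensemble = adaboost(X, y, rounds=3)
predictions = [predict(ensemble, x) for x in X]
```

On this toy set a single stump misclassifies at least two samples, while the weighted combination of three stumps classifies all six correctly.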

slide-14
SLIDE 14

Viola-Jones detector

  • 2. Feature extraction + classification

Haar features

Viola and Jones. Rapid object detection using a boosted cascade of simple features. CVPR 2001.

slide-15
SLIDE 15

Viola-Jones detector

  • 2. Feature extraction + classification

– Step 1: Select your Haar-like features
– Step 2: Integral image for fast feature evaluation
  • I can evaluate which parts of the image have the highest cross-correlation with my feature (template)
– Step 3: AdaBoost to find the weak learners
  • I cannot possibly evaluate all features at test time for all image locations
  • Learn the best set of weak learners
  • Our final classifier is the linear combination of all weak learners

Viola and Jones. Rapid object detection using a boosted cascade of simple features. CVPR 2001.
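Step 2 can be illustrated with a short sketch: once the integral image is built, the sum over any rectangle costs four lookups, so a Haar-like feature costs a handful of additions regardless of its size. The function names, the zero-padded layout of the integral image and the sign convention of the two-rectangle feature are assumptions made for this example.

```python
def integral_image(img):
    """ii[r][c] holds the sum of img over rows < r and columns < c.

    The extra zero row and column keep the rectangle lookups free of
    boundary checks.
    """
    h, w = len(img), len(img[0])
    ii = [[0] * (w + 1) for _ in range(h + 1)]
    for r in range(h):
        row_sum = 0
        for c in range(w):
            row_sum += img[r][c]
            ii[r + 1][c + 1] = ii[r][c + 1] + row_sum
    return ii

def rect_sum(ii, top, left, height, width):
    """Sum of the pixels inside a rectangle, using four lookups."""
    bottom, right = top + height, left + width
    return ii[bottom][right] - ii[top][right] - ii[bottom][left] + ii[top][left]

def haar_two_rect_horizontal(ii, top, left, height, width):
    """Two-rectangle Haar-like feature: left half minus right half (even width)."""
    half = width // 2
    left_sum = rect_sum(ii, top, left, height, half)
    right_sum = rect_sum(ii, top, left + half, height, half)
    return left_sum - right_sum

# Toy image (made up): bright left half, dark right half.
img = [
    [9, 9, 1, 1],
    [9, 9, 1, 1],
    [9, 9, 1, 1],
]
ii = integral_image(img)
total = rect_sum(ii, 0, 0, 3, 4)
edge_response = haar_two_rect_horizontal(ii, 0, 0, 3, 4)
```

Here `total` is 60 and `edge_response` is 54 - 6 = 48: the feature responds strongly because the window straddles a vertical edge.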

slide-16
SLIDE 16

Viola-Jones detector

Viola and Jones. Rapid object detection using a boosted cascade of simple features. CVPR 2001.

slide-17
SLIDE 17

Histogram of Oriented Gradients

  • 2. Feature extraction + classification

Average gradient image over training samples → gradients provide shape information. Let us create a descriptor that exploits that.

Gradient: blue arrows show the gradient, i.e., the direction of greatest change of the image.

Dalal and Triggs. Histograms of oriented gradients for human detection. CVPR 2005.

slide-18
SLIDE 18

Histogram of Oriented Gradients

  • 2. Feature extraction + classification

HOG descriptor → Histogram of oriented gradients. Compute gradients over a dense grid of cells and, for each cell, create a histogram based on gradient direction.

Dalal and Triggs. Histograms of oriented gradients for human detection. CVPR 2005.
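A simplified sketch of the histogram for a single cell. The choices below are assumptions made for illustration: central-difference gradients, 9 bins over unsigned orientations (0 to 180 degrees), magnitude-weighted votes and skipped border pixels. A full HOG descriptor additionally interpolates votes between bins and normalizes histograms over blocks of cells, which is left out here. The function name `cell_histogram` is hypothetical.

```python
import math

def cell_histogram(cell, bins=9):
    """Magnitude-weighted histogram of unsigned gradient orientations for one cell."""
    h, w = len(cell), len(cell[0])
    hist = [0.0] * bins
    bin_width = 180.0 / bins
    for r in range(1, h - 1):          # border pixels are skipped for simplicity
        for c in range(1, w - 1):
            gx = cell[r][c + 1] - cell[r][c - 1]   # horizontal central difference
            gy = cell[r + 1][c] - cell[r - 1][c]   # vertical central difference
            magnitude = math.hypot(gx, gy)
            angle = math.degrees(math.atan2(gy, gx)) % 180.0  # unsigned orientation
            hist[int(angle // bin_width) % bins] += magnitude
    return hist

# Toy cells (made up): a vertical edge and a horizontal edge.
vertical_edge = [[0, 0, 10, 10] for _ in range(4)]
horizontal_edge = [[0] * 4, [0] * 4, [10] * 4, [10] * 4]
hist_v = cell_histogram(vertical_edge)
hist_h = cell_histogram(horizontal_edge)
```

The vertical edge puts all of its gradient energy into the bin around 0 degrees, the horizontal edge into the bin containing 90 degrees. This is how the histogram encodes local shape.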

slide-19
SLIDE 19

Histogram of Oriented Gradients

  • 2. Feature extraction + classification

– Step 1: Choose your training set of images that contain the object you want to detect.
– Step 2: Choose a set of images that do NOT contain that object.
– Step 3: Extract HOG features on both sets.
– Step 4: Train an SVM classifier on the two sets to detect whether a feature vector represents the object of interest or not (0/1 classification).

Dalal and Triggs. Histograms of oriented gradients for human detection. CVPR 2005.

slide-20
SLIDE 20

Histogram of Oriented Gradients

  • 2. Feature extraction + classification

HOG features weighted by the positive SVM weights, i.e., the ones used for the pedestrian object classifier.

Dalal and Triggs. Histograms of oriented gradients for human detection. CVPR 2005.

slide-21
SLIDE 21

Deformable Part Model

  • Also based on HOG features, but based on body part detection → more robust to different body poses

Felzenszwalb et al. A discriminatively trained, multiscale, deformable part model. CVPR 2008.

slide-22
SLIDE 22

How to move towards general object detection?

slide-23
SLIDE 23

What defines an object?

  • We need a generic, class-agnostic objectness measure: how likely it is for an image region to contain an object

[Figure: one region labeled "Very likely to be an object", another labeled "Maybe it is an object"]

slide-24
SLIDE 24

What defines an object?

  • We need a generic, class-agnostic objectness measure: how likely it is for an image region to contain an object
  • Using this measure yields a number of candidate object proposals or regions of interest (RoI) where to focus.

[Figure: object proposals + classifier]

slide-25
SLIDE 25

Object proposal methods

  • Selective search: van de Sande et al. Segmentation as selective search for object recognition. ICCV 2011.
  • Edge boxes: Zitnick and Dollár. Edge boxes: locating object proposals from edges. ECCV 2014.

slide-26
SLIDE 26

Do we want all proposals?

  • Many boxes trying to explain one object
  • We need a method to keep only the “best” boxes

slide-27
SLIDE 27

Non-Maximum Suppression (NMS)

  • Many boxes trying to explain one object
  • We need a method to keep only the “best” boxes

slide-28
SLIDE 28

Non-Maximum Suppression (NMS)

Start with anchor box i
For another box j
  If they overlap
    Discard box i if the score is lower than the score of j

Overlap = to be defined
Score = depends on the task

slide-29
SLIDE 29

Region overlap

  • We measure region overlap with the Intersection over Union (IoU) or Jaccard Index:

$$J(A, B) = \frac{|A \cap B|}{|A \cup B|}$$

[Figure: intersection and union of two regions A and B]
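For axis-aligned boxes the Jaccard index reduces to a few comparisons. The sketch below assumes boxes given as (x, y, w, h) with (x, y) the top-left corner. That convention is an assumption for this example, since the slides do not fix which point (x, y) refers to.

```python
def iou(box_a, box_b):
    """Intersection over Union of two boxes (x, y, w, h), (x, y) = top-left corner."""
    ax, ay, aw, ah = box_a
    bx, by, bw, bh = box_b
    # Width and height of the overlap rectangle, clamped at 0 when boxes are disjoint.
    inter_w = max(0.0, min(ax + aw, bx + bw) - max(ax, bx))
    inter_h = max(0.0, min(ay + ah, by + bh) - max(ay, by))
    intersection = inter_w * inter_h
    union = aw * ah + bw * bh - intersection
    return intersection / union if union > 0 else 0.0

partial = iou((0, 0, 2, 2), (1, 1, 2, 2))   # overlap area 1, union area 7
same = iou((0, 0, 2, 2), (0, 0, 2, 2))      # identical boxes
apart = iou((0, 0, 2, 2), (5, 5, 2, 2))     # disjoint boxes
```

The three cases give 1/7, 1.0 and 0.0 respectively, covering the range of the measure.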

slide-30
SLIDE 30

Non-Maximum Suppression (NMS)

Start with anchor box i
For another box j
  If they overlap
    Discard box i if the score is lower than the score of j

Overlap = to be defined
Score = depends on the task
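A minimal sketch of the commonly used greedy variant of NMS: boxes are visited in order of decreasing score, and a box is kept only if its overlap with every already kept box stays at or below a threshold. It assumes IoU as the overlap measure and boxes given as (x, y, w, h) with (x, y) the top-left corner. The function names, the default threshold of 0.5 and the toy boxes are illustrative assumptions.

```python
def iou(box_a, box_b):
    """Intersection over Union of two boxes (x, y, w, h), (x, y) = top-left corner."""
    ax, ay, aw, ah = box_a
    bx, by, bw, bh = box_b
    inter_w = max(0.0, min(ax + aw, bx + bw) - max(ax, bx))
    inter_h = max(0.0, min(ay + ah, by + bh) - max(ay, by))
    intersection = inter_w * inter_h
    union = aw * ah + bw * bh - intersection
    return intersection / union if union > 0 else 0.0

def nms(boxes, scores, iou_threshold=0.5):
    """Return the indices of the kept boxes, highest score first."""
    order = sorted(range(len(boxes)), key=lambda i: scores[i], reverse=True)
    keep = []
    for i in order:
        # Discard box i if a higher-scoring kept box overlaps it too much.
        if all(iou(boxes[i], boxes[k]) <= iou_threshold for k in keep):
            keep.append(i)
    return keep

# Toy detections (made up): two boxes on the same object, one elsewhere.
boxes = [(0, 0, 10, 10), (1, 1, 10, 10), (20, 20, 10, 10)]
scores = [0.9, 0.8, 0.7]
kept_strict = nms(boxes, scores, iou_threshold=0.5)
kept_loose = nms(boxes, scores, iou_threshold=0.7)
```

The first two boxes have an IoU of 81/119 ≈ 0.68. With a threshold of 0.5 the lower-scoring one is suppressed, while a threshold of 0.7 keeps both, so the result depends directly on the threshold.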

slide-31
SLIDE 31

NMS: the problem

[Figure labels: ground truth positions]

Hosang, Benenson and Schiele. A Convnet for Non-Maximum Suppression. 2015

slide-32
SLIDE 32

NMS: the problem

  • Choosing a narrow threshold

[Figure labels: ground truth positions, false positives, low precision]

Hosang, Benenson and Schiele. A Convnet for Non-Maximum Suppression. 2015

slide-33
SLIDE 33

NMS: the problem

  • Choosing a wider threshold

[Figure labels: ground truth position, false positive, false negative, low recall]

Hosang, Benenson and Schiele. A Convnet for Non-Maximum Suppression. 2015

slide-34
SLIDE 34

Non-Maximum Suppression (NMS)

  • NMS will be used at test time. Most detection methods (even Deep Learning ones) use NMS!

slide-35
SLIDE 35

Learning-based detectors

slide-36
SLIDE 36

Types of object detectors

  • One-stage detectors
    Image → Feature extraction → Classification: class score (cat, dog, person) + Localization: bounding box (x, y, w, h)
  • Two-stage detectors
    Image → Feature extraction → Extraction of object proposals → Classification: class score (cat, dog, person) + Localization: refine bounding box (Δx, Δy, Δw, Δh)
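The refinement step of a two-stage detector can be sketched as applying predicted offsets to a proposal box. The slide only names the offsets (Δx, Δy, Δw, Δh). The parameterization below (center shifts scaled by the box size, log-scale width and height changes) is the one commonly used in the R-CNN family and is an assumption here, as are the function name `apply_deltas` and the center-based box convention.

```python
import math

def apply_deltas(box, deltas):
    """Refine a proposal box with predicted offsets.

    box    = (x, y, w, h), with (x, y) the box center (assumption)
    deltas = (dx, dy, dw, dh), the predicted refinement
    """
    x, y, w, h = box
    dx, dy, dw, dh = deltas
    new_x = x + dx * w            # shift the center by a fraction of the width
    new_y = y + dy * h            # shift the center by a fraction of the height
    new_w = w * math.exp(dw)      # scale the width (log-space offset)
    new_h = h * math.exp(dh)      # scale the height (log-space offset)
    return (new_x, new_y, new_w, new_h)

proposal = (50.0, 50.0, 20.0, 10.0)
unchanged = apply_deltas(proposal, (0.0, 0.0, 0.0, 0.0))
refined = apply_deltas(proposal, (0.1, -0.2, math.log(2.0), 0.0))
```

Zero offsets leave the proposal untouched. The second call moves the center to (52, 48) and doubles the width to 40 while keeping the height at 10. Predicting relative offsets rather than absolute coordinates keeps the regression targets independent of the proposal's size.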

slide-37
SLIDE 37

Types of object detectors

  • One-stage detectors

– YOLO, SSD, RetinaNet
– CenterNet, CornerNet, ExtremeNet

  • Two-stage detectors

– R-CNN, Fast R-CNN, Faster R-CNN
– SPP-Net, R-FCN, FPN

slide-38
SLIDE 38

Object detection