Learning-based Contour Detection & Contour-based Object Detection - - PowerPoint PPT Presentation



SLIDE 1

  • Learning-based Contour Detection & Contour-based Object Detection

Iasonas Kokkinos, 21 January 2011
Visual Geometry Group, Oxford
Galen Group, INRIA-Saclay
Department of Applied Mathematics, Ecole Centrale de Paris

SLIDE 2

  • Talk outline

Boundary Detection (35’)
  – Logistic regression and Anyboost
  – F-measure Boosting
  – MIL and boundary detection
  – Monte Carlo approximations for large-scale datasets
Object Detection (15’)
  – Appearance descriptors and boundary detection
  – Coarse-to-fine inference (parsing)
  – Model learning

SLIDE 3

  • Image Contours

Object/Surface Boundaries (edges)

SLIDE 4

  • Image Contours

Symmetry axes (ridges/valleys)

SLIDE 5

  • A biref anlaogy wtih txet (intentionally scrambled)

Waht mttares is waht hppaens on wrod bandouries. Mocpera iwht htsi (compare with this). Concrete evidence that our visual system employs boundary detection. Contour-based approaches: shape matching, segmentation, recognition, ...

SLIDE 6

  • How can we detect boundaries?

Filtering approaches
  • Canny (1986), Morrone and Owens (1987), Perona and Malik (1991), ...

Scale-space approaches
  • T. Lindeberg, "Edge Detection and Ridge Detection with Automatic Scale Selection", IJCV 30(2): 117-156 (1998)
  • A. P. Witkin, "Scale-space filtering", IJCAI (1983)

Variational approaches
  • V. Caselles, R. Kimmel, G. Sapiro, "Geodesic Active Contours", IJCV 22(1): 61-79 (1997)
  • K. Siddiqi, Y. Lauzière, A. Tannenbaum, S. Zucker, "Area and length minimizing flows for shape segmentation", IEEE TIP 7(3): 433-443 (1998)
  • M. Kass, A. Witkin, D. Terzopoulos, "Snakes: Active Contour Models", ICCV (1987)

Gestalt-based approaches
  • A. Desolneux, L. Moisan, J.-M. Morel, "Meaningful Alignments", IJCV 40(1): 7-23 (2000)
SLIDE 7

  • Learning-based approaches

Boundary or non-boundary? Use human-annotated segmentations.

  • D. Martin, C. Fowlkes, J. Malik, "Learning to Detect Natural Image Boundaries Using Local Brightness, Color and Texture Cues", IEEE PAMI, 2004
  • S. Konishi, A. Yuille, J. Coughlan, S. C. Zhu, "Statistical Edge Detection: Learning and Evaluating Edge Cues", IEEE PAMI, 2003

SLIDE 8

  • Progress during the last 40 years

Prewitt (1965) → Canny + hysteresis → Berkeley Pb (’04) → Berkeley gPb (’08) → Humans

SLIDE 9

  • A closer look into gPb: features

[Figure: oriented features computed over a disc of radius r at (x, y), orientation θ]

Local features (Pb, 2004); global features (gPb, 2008): N-Cuts eigenvectors.

SLIDE 10

  • A closer look into gPb: classifier

Logistic regression

SLIDE 11

  • Talk outline

Boundary Detection (35’)
  – Logistic regression and Anyboost
  – F-measure Boosting
  – MIL and boundary detection
  – Monte Carlo approximations for large-scale datasets
Object Detection (15’)
  – Appearance descriptors and boundary detection
  – Coarse-to-fine inference (parsing)
  – Model learning

SLIDE 12

  • Learning

Wanted: a `simple’ classifier that `works well’ on the training data.
Given: a training set of feature-label pairs.
`simple’: quantified by VC dimension, curvature, ...
`works well’: quantified by a loss criterion.

SLIDE 13

  • Logistic regression

Linear function: s(x) = w^T x, with P(y=1|x) = σ(s(x)). Log-likelihood of a training pair: log P(y_i | x_i). Loss function: minus the log-likelihood over the training set. Optimization: Newton-Raphson (IRLS).
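The recipe on this slide can be sketched in a few lines of NumPy. This is a generic illustration, not the actual Pb/gPb training code; the function name and the small ridge term added for numerical stability are choices of this sketch.

```python
import numpy as np

def irls_logistic(X, y, n_iter=15, ridge=1e-6):
    """Fit w for P(y=1|x) = sigmoid(w @ x) by Newton-Raphson (IRLS).

    X: (n, d) feature matrix; y: (n,) labels in {0, 1}.
    The ridge term keeps the Hessian invertible on separable data
    (an assumption of this sketch, not part of the slide's recipe).
    """
    n, d = X.shape
    w = np.zeros(d)
    for _ in range(n_iter):
        p = 1.0 / (1.0 + np.exp(-X @ w))                 # predicted probabilities
        g = X.T @ (p - y)                                # gradient of minus log-likelihood
        R = p * (1 - p)                                  # IRLS weights
        H = X.T @ (X * R[:, None]) + ridge * np.eye(d)   # Hessian
        w -= np.linalg.solve(H, g)                       # Newton step
    return w
```

With a bias column in X, a few Newton steps already separate a toy 1-D training set.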

SLIDE 14

  • Anyboost

Additive form: F(x) = Σ_t a_t h_t(x). View the training cost as a function of the responses F(x_i). Steepest descent direction: minus the gradient of the cost with respect to the responses. At each round, add the optimal pair (a_t, h_t): find the weak learner h `closest’ to the descent direction, then set its weight. Adaboost: exponential loss; the sign of the correlation gives the learner’s polarity, its magnitude the weight.
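A minimal sketch of the Anyboost loop. Threshold stumps as weak learners and a fixed step size in place of a line search are simplifying assumptions of this sketch, and the names `anyboost`/`loss_grad` are invented here; plugging in the gradient of the exponential loss recovers Adaboost-style example weighting.

```python
import numpy as np

def anyboost(X, y, loss_grad, n_rounds=25, lr=0.5):
    """Generic Anyboost: greedily add weak learners along the steepest
    descent direction of the training cost, seen as a function of the
    per-example responses F_i.

    Weak learners are axis-aligned threshold stumps h(x) in {-1, +1};
    y in {-1, +1}; loss_grad(F, y) returns dL/dF for each example.
    """
    n, d = X.shape
    F = np.zeros(n)                          # current responses F(x_i)
    ensemble = []
    for _ in range(n_rounds):
        r = -loss_grad(F, y)                 # steepest descent direction
        best, best_corr = None, 0.0
        for j in range(d):                   # stump most correlated with r
            for thr in np.unique(X[:, j]):
                h = np.where(X[:, j] >= thr, 1.0, -1.0)
                c = h @ r
                if abs(c) > abs(best_corr):
                    best, best_corr = (j, thr, np.sign(c)), c
        if best is None:                     # no descent direction left
            break
        j, thr, s = best
        h = s * np.where(X[:, j] >= thr, 1.0, -1.0)
        F += lr * h                          # fixed step instead of a line search
        ensemble.append((j, thr, s, lr))
    return ensemble, F

# Exponential loss L = sum_i exp(-y_i F_i), so dL/dF_i = -y_i exp(-y_i F_i)
exp_grad = lambda F, y: -y * np.exp(-y * F)
```

On a 1-D "interval" toy set (positives in the middle), boosted stumps fit the training labels even though no single stump can.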

SLIDE 15

  • Side-by-side

                Anyboost                            Logistic regression
Form:           additive                            linear
Summands:       weak learners, added `on the fly’   features, fixed
Cost:           exponential loss (Adaboost)         minus label log-likelihood
Optimization:   coordinate descent                  Newton-Raphson

Connections: M. Collins, R. Schapire, Y. Singer, "Logistic Regression, AdaBoost and Bregman Distances", COLT (2000)

SLIDE 16

  • A compact combination

Additive model with a linear part. Goal: quick classification, using a small (e.g. ) feature set; the remaining summands are weak learners (nonlinearities). Optimization: Newton-Raphson for the linear coefficients at each iteration. Slower, but off-line.

SLIDE 17

  • Talk outline

Boundary Detection (35’)
  – Logistic regression and Anyboost
  – F-measure Boosting
  – MIL and boundary detection
  – Monte Carlo approximations for large-scale datasets
Object Detection (15’)
  – Appearance descriptors and boundary detection
  – Coarse-to-fine inference (parsing)
  – Model learning

SLIDE 18

  • Cost function for training

Ingredients: classifier, loss, training set. The usual training costs are additive over the training set. Non-additive losses: F-measure, Area Under Curve (AUC), ...
  • potentially better suited for the problem
  • but also potentially non-convex (local optimality)

  • M. Ranjbar, G. Mori and Y. Wang `Optimizing Complex Loss Functions in Structured Prediction’ ECCV, 2010
  • T. Joachims, `A Support Vector Method for Multivariate Performance Measures’, ICML, 2005
  • M. Jansche, `Maximum Expected F-Measure Training Of Logistic Regression Models’, EMNLP, 2005
SLIDE 19

  • F-measure

Goal: deal with unbalanced datasets (many negatives); no reward for true-negative decisions. From the predicted labels, count true positives, false alarms and misses: precision = true positives / (true positives + false alarms), recall = true positives / (true positives + misses). F-measure: harmonic mean of precision and recall.
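The definitions on this slide, written out in plain Python (the function name is invented here). Note that true negatives never enter the computation, which is the point on unbalanced data:

```python
def f_measure(pred, truth):
    """F-measure: harmonic mean of precision and recall.

    pred, truth: iterables of 0/1 labels. True negatives never appear
    in the formula, so a flood of easy negatives does not inflate the score.
    """
    tp = sum(1 for p, t in zip(pred, truth) if p and t)          # true positives
    fp = sum(1 for p, t in zip(pred, truth) if p and not t)      # false alarms
    fn = sum(1 for p, t in zip(pred, truth) if not p and t)      # misses
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)
```

Padding a dataset with correctly rejected negatives leaves the F-measure unchanged.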

SLIDE 20

  • F-measure approximation

Replace the hard predicted labels with a differentiable approximation, yielding an approximate, smooth F-measure.

  • M. Jansche, ‘Maximum Expected F-Measure Training Of Logistic Regression Models’, EMNLP, 2005
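In the spirit of Jansche's expected F-measure, hard decisions can be replaced by sigmoid probabilities so that soft counts of true positives, false alarms and misses are smooth in the classifier responses. This sketch uses the identity F = 2·tp / (2·tp + fp + fn); the function name and the small epsilon are choices made here.

```python
import numpy as np

def soft_f_measure(scores, y):
    """Differentiable approximation of the F-measure.

    scores: real-valued classifier responses; y: 0/1 ground-truth labels.
    Hard 0/1 decisions are replaced by sigmoid probabilities, so the
    counts below are smooth functions of the responses and the whole
    expression can be optimized by gradient methods.
    """
    p = 1.0 / (1.0 + np.exp(-scores))   # soft predicted labels in (0, 1)
    tp = np.sum(p * y)                  # expected true positives
    fp = np.sum(p * (1 - y))            # expected false alarms
    fn = np.sum((1 - p) * y)            # expected misses
    return 2 * tp / (2 * tp + fp + fn + 1e-12)
```

As the responses saturate toward the correct labels, the soft F-measure approaches the hard one.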
SLIDE 21

  • F-measure optimization via Anyboost

View the loss as a function of the responses and apply Anyboost, starting from the previous iteration; Newton-Raphson for the coefficients, as in Jansche’s paper.

SLIDE 22

  • Talk outline

Boundary Detection (35’)
  – Logistic regression and Anyboost
  – F-measure Boosting
  – MIL and boundary detection
  – Monte Carlo approximations for large-scale datasets
Object Detection (15’)
  – Appearance descriptors and boundary detection
  – Coarse-to-fine inference (parsing)
  – Model learning

SLIDE 23

  • Sneaking into the fun room

Mom’s keychain, dad’s keychain, grandma’s keychain. We know that dad cannot enter the fun room, either. Which key should we try?

Slide Credit: B. Babenko/T. Dietterich

SLIDE 24

  • Multiple Instance Learning

Typical learning: one label per instance. Multiple instance learning: labels on bags of instances. Positive bag: at least one instance should be positive. Negative bag: no instance should be positive.

Slide Credit: K. Grauman
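The bag semantics ("at least one positive instance") can be turned into a differentiable bag score with noisy-OR pooling, a common choice in MIL. This is an illustration of that choice, not necessarily the pooling used in the talk, and the function name is invented here:

```python
import numpy as np

def bag_probability(instance_probs):
    """MIL bag pooling by noisy-OR.

    A bag is positive iff at least one instance is positive, so under
    independence P(bag = 1) = 1 - prod_i (1 - p_i). Max-pooling over
    the instance probabilities is a common harder alternative.
    """
    instance_probs = np.asarray(instance_probs, dtype=float)
    return 1.0 - np.prod(1.0 - instance_probs)
```

A single confident instance makes the whole bag confident, while an all-negative bag stays at zero.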

SLIDE 25

  • MIL and boundary detection

Problem I: inconsistent orientation information. Problem II: inconsistent location information. Form a bag of image locations/orientations that can `support’ a human-annotated boundary; given the orientation and location support, compute the overall support for a boundary at r.

SLIDE 26

  • Anyboost for F-measure boosting (previous section)

Loss as a function of the responses.

SLIDE 27

  • Anyboost for MIL & F-measure boosting

Loss as a function of the responses.

SLIDE 28

  • Talk outline

Boundary Detection (35’)
  – Logistic regression and Anyboost
  – F-measure Boosting
  – MIL and boundary detection
  – Monte Carlo approximations for large-scale datasets
Object Detection (15’)
  – Appearance descriptors and boundary detection
  – Coarse-to-fine inference (parsing)
  – Model learning

SLIDE 29

  • Weak learner selection

In all cases:

SLIDE 30

  • Gains so far

[Precision-recall curves: effect of training]
Global Pb, F = 0.697
MIL + full training set, F = 0.704
MIL + full training set + boosting, F = 0.711

SLIDE 31

  • Talk outline

Boundary Detection (35’)
  – Logistic regression and Anyboost
  – F-measure Boosting
  – MIL and boundary detection
  – Monte Carlo approximations for large-scale datasets
Object Detection (15’)
  – Appearance descriptors and boundary detection
  – Coarse-to-fine inference (parsing)
  – Model learning

SLIDE 32

  • Appearance Descriptors

Dense descriptors (DAISY-like)
Multi-scale Gaussians & Gabors, Infinite Impulse Response (IIR) implementations
Goal: capture context for boundary detection

SLIDE 33

  • Discriminative dimensionality reduction

Squeeze discriminative information out of a high-dimensional descriptor.
LDA: only 1-D (2-class separation).
Large Margin Nearest Neighbors, Neighborhood Component Analysis, ...: iterative, work with < 10^4 features.
Partial Least Squares: iterative.
`Sliced Average Variance Estimation’ (SAVE):

R. D. Cook, "SAVE: a method for dimension reduction and graphics in regression", Comm. Statist. Theory Methods 29 (2000)

Find B, d×N, d < N, such that P(y | Bx, x) = P(y | Bx). Algorithm:
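A sketch of how the SAVE recipe might look in code: standardize x, then eigendecompose M = Σ_c p_c (I − Cov(z | y = c))², keeping the top eigenvectors as the discriminative subspace. This is a two-class NumPy illustration under the assumption that slicing on the class label is sufficient; `save_directions` is a name invented here.

```python
import numpy as np

def save_directions(X, y, d):
    """Sliced Average Variance Estimation (after Cook, 2000), sketch.

    X: (n, N) features; y: (n,) discrete labels; returns B of shape (d, N)
    whose rows span the estimated discriminative subspace.
    """
    mu = X.mean(axis=0)
    C = np.cov(X.T)
    # Whiten: z = C^{-1/2} (x - mu)
    evals, evecs = np.linalg.eigh(C)
    W = evecs @ np.diag(1.0 / np.sqrt(np.maximum(evals, 1e-12))) @ evecs.T
    Z = (X - mu) @ W
    N = X.shape[1]
    M = np.zeros((N, N))
    I = np.eye(N)
    for c in np.unique(y):                    # slices = classes
        Zc = Z[y == c]
        Sc = np.cov(Zc.T)                     # within-slice covariance
        M += (len(Zc) / len(X)) * (I - Sc) @ (I - Sc)
    _, mevecs = np.linalg.eigh(M)             # ascending eigenvalues
    top = mevecs[:, ::-1][:, :d]              # top-d eigenvectors
    return (W @ top).T                        # map back to x-space: (d, N)
```

On synthetic data whose two classes differ only in their variance along the first axis (a case where the class means coincide and LDA fails), SAVE recovers that axis.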

SLIDE 34

  • PCA vs SAVE (for SIFT features)

[Figure: SIFT descriptor projections from SAVE vs projections from PCA, at scales 1-3]

SLIDE 35

  • Dense descriptor projections
SLIDE 36

  • Overall gains

[Precision-recall curves: effect of features]
Global Pb, F = 0.697
MIL + full training set + boosting, F = 0.711
DoG + LoG + Gabor, F = 0.719
Context + DoG + LoG + Gabor, F = 0.726

SLIDE 37

  • Comparisons with gPb

[Figure: probability maps P; Global Pb, thresholded at P > .5]

SLIDE 38

  • Talk outline

Boundary Detection (35’)
  – Logistic regression and Anyboost
  – F-measure Boosting
  – MIL and boundary detection
  – Monte Carlo approximations for large-scale datasets
Object Detection (15’)
  – Appearance descriptors and boundary detection
  – Coarse-to-fine inference (parsing)
  – Model learning

SLIDE 39

  • Where can contours be useful?

Recognition? Contours carry most of the image information (Attneave, 1967), yet they are highly redundant: only junctions/corners/endings matter, and corners/blobs/junctions are hard to group post-hoc.

SLIDE 40

  • Hierarchical Compositional Models

Object → Parts → Contours → Tokens

Iasonas Kokkinos and Alan Yuille, "Inference and Learning with Hierarchical Shape Models", Int. Journal of Computer Vision (IJCV), to appear

SLIDE 41

  • Compositional Object Detection

View production rules as composition rules; build a parse tree for the object.
SLIDE 42

  • Composition of the `back’ structure

Problem: Too many options! (Combinatorial explosion)

SLIDE 43

  • A* for object parsing

How can we extend A* to parsing?
  – P. Felzenszwalb and D. McAllester, "The Generalized A* Architecture", JAIR, 2007
How can we apply A* parsing to object detection?
  – I. Kokkinos and A. Yuille, "HOP: Hierarchical Object Parsing", CVPR 2009

SLIDE 44

  • Coarse-level parsing
SLIDE 45

  • Fine-level parsing
SLIDE 46

  • A* vs Knuth’s Lightest Derivation (DP)

[Figure: comparison of the two search strategies]

SLIDE 47

  • Talk outline

Boundary Detection (35’)
  – Logistic regression and Anyboost
  – F-measure Boosting
  – MIL and boundary detection
  – Monte Carlo approximations for large-scale datasets
Object Detection (15’)
  – Appearance descriptors and boundary detection
  – Coarse-to-fine inference (parsing)
  – Model learning

SLIDE 48

  • Learning problem

Input: a set of unregistered images containing the object.
Output: a hierarchical model and a parsing cost criterion.

  • Learning pipeline
    – Contours
    – Parts
    – Cost

SLIDE 49

  • Deformable model
SLIDE 50

  • Learning deformable models

E-step: deform the model to the edges & ridges of the input images. M-step: update the template (AAM fit).

  • I. Kokkinos and A. Yuille, "Unsupervised Learning of Object Deformation Models", ICCV 2007
SLIDE 51

  • Recovering object contours
SLIDE 52

  • Recovering object contours- ETHZ Shapes
SLIDE 53

  • Recovering object parts

Perceptual grouping-based graph; affinity propagation results.

SLIDE 54

  • Recovering object parts – ETHZ Shapes
SLIDE 55

  • Discriminative cost training

Goal: learn a cost that leads to accurate detection.
  – But, no manual annotations to train with
  – Sole information: class labels

[Figure: Object - Parts - Contours - Tokens hierarchy]

SLIDE 56

  • MIL-based formulation

Parses as hidden data: positive bags and negative bags.

SLIDE 57

  • Improvement of cost function: better parsing

[Figure: parses at Round 2 vs Round 6]

  • P. Gehler and O. Chapelle, "Deterministic Annealing for Multiple Instance Learning", AISTATS, 2007
SLIDE 58

  • Improvement of cost function: better localization

[Figure: localization at Round 2 vs Round 6]

SLIDE 59

  • Parsing and localization results
SLIDE 60

  • Benchmark results
SLIDE 61

  • Failure cases

Front-end failures; missing appearance information / poor shape model.

SLIDE 62

  • Talk outline

Boundary Detection (35’)
  – Logistic regression and Anyboost
  – F-measure Boosting
  – MIL and boundary detection
  – Monte Carlo approximations for large-scale datasets
Object Detection (15’)
  – Appearance descriptors and boundary detection
  – Coarse-to-fine inference (parsing)
  – Model learning
Conclusions (1’)

SLIDE 63

  • Conclusion

It is not the same, indeed. Results: it is not too different.
Future work: make it closer
  – combine contours and appearance descriptors
  – structured statistical models for shape
  – integrate segmentation (symmetry)

SLIDE 64

  • Thank you

Acknowledgements

M. Bronstein: slide template