Object Recognition and Scene Understanding (6.870) - MIT student presentation



slide-1
SLIDE 1

6.870 Object Recognition and Scene Understanding

student presentation

MIT

slide-2
SLIDE 2

6.870

Template matching and histograms

Nicolas Pinto

slide-3
SLIDE 3

Introduction

slide-4
SLIDE 4

Hosts

Antonio T...

(who knows a lot about vision)

a frog...

(who has big eyes) and thus should know a lot about vision...

a guy...

(who has big arms)

slide-5
SLIDE 5

Object Recognition from Local Scale-Invariant Features

David G. Lowe, Computer Science Department, University of British Columbia, Vancouver, B.C., V6T 1Z4, Canada. lowe@cs.ubc.ca

Abstract: An object recognition system has been developed that uses a new class of local image features. The features are invariant to image translation, scaling, and rotation, and partially invariant to illumination changes and affine or 3D projection. Previous approaches to local feature generation lacked invariance to scale and were more sensitive to projective distortion and illumination change. The SIFT features share a number of properties in common with the responses of neurons in inferior temporal cortex...

Lowe (1999)

Histograms of Oriented Gradients for Human Detection

Navneet Dalal and Bill Triggs, INRIA Rhône-Alpes, 655 avenue de l'Europe, Montbonnot 38334, France. {Navneet.Dalal,Bill.Triggs}@inrialpes.fr, http://lear.inrialpes.fr

Abstract: We study the question of feature sets for robust visual object recognition, adopting linear SVM based human detection as a test case. After reviewing existing edge and gradient based descriptors, we show experimentally that grids of Histograms of Oriented Gradient (HOG) descriptors significantly outperform existing feature sets for human detection. We study the influence of each stage of the computation... We briefly discuss previous work on human detection in §2, give an overview of our method in §3, describe our data sets in §4 and give a detailed description and experimental evaluation of each stage of the process in §5–6. The main conclusions are summarized in §7.

Dalal and Triggs (2005)

3 papers

A Discriminatively Trained, Multiscale, Deformable Part Model

Pedro Felzenszwalb, University of Chicago (pff@cs.uchicago.edu); David McAllester, Toyota Technological Institute at Chicago (mcallester@tti-c.org); Deva Ramanan, UC Irvine (dramanan@ics.uci.edu)

Abstract: This paper describes a discriminatively trained, multiscale, deformable part model for object detection. Our system achieves a two-fold improvement in average precision over the best performance in the 2006 PASCAL person detection challenge. It also outperforms the best results in the 2007 challenge in ten out of twenty categories. The system relies heavily on deformable parts. While deformable part models have become quite popular, their value had not been demonstrated on difficult benchmarks such as the PASCAL challenge.

Felzenszwalb et al. (2008)

yey!!

slide-6
SLIDE 6

Object Recognition from Local Scale-Invariant Features

Lowe (1999)

Histograms of Oriented Gradients for Human Detection

Dalal and Triggs (2005)

A Discriminatively Trained, Multiscale, Deformable Part Model

Felzenszwalb et al. (2008)

slide-7
SLIDE 7

Scale-Invariant Feature Transform (SIFT)

adapted from Kucuktunc
slide-8
SLIDE 8

Scale-Invariant Feature Transform (SIFT)

adapted from Brown, ICCV 2003
slide-9
SLIDE 9

SIFT local features are

invariant...

adapted from David Lee
slide-10
SLIDE 10

like me they are robust...


... to changes in illumination, noise, viewpoint, occlusion, etc.

slide-11
SLIDE 11

I am sure you want to know

how to build them


  • 1. find interest points or “keypoints”
  • 2. find their dominant orientation
  • 3. compute their descriptor
  • 4. match them on other images
slide-12
SLIDE 12


  • 1. find interest points or “keypoints”
slide-13
SLIDE 13


keypoints are taken as maxima/minima

of a DoG pyramid

in this setting, extrema are invariant to scale...

slide-14
SLIDE 14

a DoG (Difference of Gaussians) pyramid is simple to compute...

even he can do it! (before / after)

adapted from Pallus and Fleishman
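As a rough sketch, the DoG pyramid can be built with plain NumPy; the function names, the base scale σ₀ = 1.6, and the √2 scale step are illustrative choices, not taken from the slides:

```python
import numpy as np

def gaussian_kernel(sigma):
    # 1-D Gaussian kernel, truncated at ~3 sigma, normalized to sum to 1
    radius = int(3 * sigma + 0.5)
    x = np.arange(-radius, radius + 1)
    k = np.exp(-x**2 / (2 * sigma**2))
    return k / k.sum()

def gaussian_blur(img, sigma):
    # separable blur: convolve every row, then every column
    k = gaussian_kernel(sigma)
    tmp = np.apply_along_axis(np.convolve, 1, img, k, mode="same")
    return np.apply_along_axis(np.convolve, 0, tmp, k, mode="same")

def dog_pyramid(img, sigma0=1.6, step=2**0.5, levels=4):
    # blur at successively larger sigmas; each DoG level is the
    # difference of two adjacent blurred images
    blurred = [gaussian_blur(img.astype(float), sigma0 * step**i)
               for i in range(levels)]
    return [b1 - b0 for b0, b1 in zip(blurred, blurred[1:])]
```

Real SIFT also builds octaves by repeated downsampling; this sketch stays at one resolution for brevity.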
slide-15
SLIDE 15

then we just have to find

neighborhood extrema

in this 3D DoG space

if a pixel is an extremum within its neighboring region, it becomes a candidate

keypoint
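A minimal sketch of this extremum scan, assuming the DoG pyramid is given as a list of same-sized arrays; the function name and the 0.03 contrast threshold are illustrative:

```python
import numpy as np

def local_extrema(dog, threshold=0.03):
    # a pixel becomes a candidate keypoint if it attains the max or min
    # of its 3x3x3 (scale x row x col) DoG neighborhood and has enough
    # contrast to survive the low-contrast rejection step
    stack = np.stack(dog)                  # shape: (scales, H, W)
    keypoints = []
    for s in range(1, stack.shape[0] - 1):
        for y in range(1, stack.shape[1] - 1):
            for x in range(1, stack.shape[2] - 1):
                v = stack[s, y, x]
                cube = stack[s-1:s+2, y-1:y+2, x-1:x+2]
                if abs(v) > threshold and (v == cube.max() or v == cube.min()):
                    keypoints.append((s, y, x))
    return keypoints
```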

slide-16
SLIDE 16

too many keypoints?

  • 1. remove low-contrast points

  • 2. remove edge responses

adapted from wikipedia
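The edge-rejection step can be sketched with the 2x2 Hessian trace/determinant ratio test from Lowe's paper (r = 10 is the paper's suggested curvature-ratio threshold; the function name is ours):

```python
import numpy as np

def is_edge(dog_img, y, x, r=10.0):
    # 2x2 Hessian of the DoG at the keypoint: edge-like points have one
    # large and one small principal curvature, i.e. trace^2/det exceeds
    # (r+1)^2 / r; a non-positive determinant is also rejected
    d = dog_img
    dxx = d[y, x+1] + d[y, x-1] - 2 * d[y, x]
    dyy = d[y+1, x] + d[y-1, x] - 2 * d[y, x]
    dxy = (d[y+1, x+1] - d[y+1, x-1] - d[y-1, x+1] + d[y-1, x-1]) / 4.0
    tr, det = dxx + dyy, dxx * dyy - dxy**2
    return det <= 0 or tr**2 / det > (r + 1)**2 / r
```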
slide-17
SLIDE 17


  • 2. find their dominant orientation
slide-18
SLIDE 18

each selected keypoint is assigned one or more "dominant" orientations... ...this step is important to achieve rotation invariance

slide-19
SLIDE 19

How?

using the DoG pyramid to achieve scale invariance:

  • a. compute image gradient

magnitude and orientation

  • b. build an orientation histogram
  • c. keypoint’s orientation(s) = peak(s)
slide-20
SLIDE 20
  • a. compute image gradient

magnitude and orientation

slide-21
SLIDE 21
  • b. build an orientation histogram
adapted from Ofir Pele
slide-22
SLIDE 22
  • c. keypoint’s orientation(s) = peak(s)

*

* the peak ;-)
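Steps a-c might look like this in NumPy; this is a whole-patch histogram for brevity, whereas real SIFT weights votes by a Gaussian window and interpolates between bins:

```python
import numpy as np

def gradients(img):
    # per-pixel gradient magnitude and orientation (degrees, 0-360)
    dy, dx = np.gradient(img.astype(float))
    mag = np.hypot(dx, dy)
    ori = np.degrees(np.arctan2(dy, dx)) % 360.0
    return mag, ori

def dominant_orientation(mag, ori, bins=36):
    # (b) magnitude-weighted orientation histogram;
    # (c) the peak bin gives the keypoint's dominant orientation
    hist, edges = np.histogram(ori, bins=bins, range=(0, 360), weights=mag)
    peak = hist.argmax()
    return (edges[peak] + edges[peak + 1]) / 2.0
```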

slide-23
SLIDE 23


  • 3. compute their descriptor
slide-24
SLIDE 24

SIFT descriptor = a set of orientation histograms

4x4 array x 8 bins = 128 dimensions (normalized), computed over a 16x16 neighborhood of pixel gradients
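A simplified descriptor sketch; it omits the Gaussian weighting, the rotation to the dominant orientation, and the trilinear interpolation that the real SIFT descriptor uses:

```python
import numpy as np

def sift_descriptor(mag, ori, y, x, size=16):
    # 16x16 patch -> 4x4 grid of cells, each an 8-bin orientation
    # histogram -> 4*4*8 = 128-D vector, L2-normalized
    h = size // 2
    pm = mag[y-h:y+h, x-h:x+h]
    po = ori[y-h:y+h, x-h:x+h]
    desc = []
    for cy in range(0, size, 4):
        for cx in range(0, size, 4):
            cell_m = pm[cy:cy+4, cx:cx+4].ravel()
            cell_o = po[cy:cy+4, cx:cx+4].ravel()
            hist, _ = np.histogram(cell_o, bins=8, range=(0, 360),
                                   weights=cell_m)
            desc.extend(hist)
    desc = np.asarray(desc, dtype=float)
    norm = np.linalg.norm(desc)
    return desc / norm if norm > 0 else desc
```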
slide-25
SLIDE 25


  • 4. match them on other images
slide-26
SLIDE 26

How to match?

nearest neighbor, Hough transform voting, least-squares fit, etc.
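Nearest-neighbor matching with Lowe's distance-ratio test can be sketched as follows (0.8 is the paper's suggested ratio; the function name is ours):

```python
import numpy as np

def match_descriptors(desc1, desc2, ratio=0.8):
    # accept a match only if the best distance is clearly smaller than
    # the second best, which rejects ambiguous correspondences
    matches = []
    for i, d in enumerate(desc1):
        dists = np.linalg.norm(desc2 - d, axis=1)
        j, k = np.argsort(dists)[:2]
        if dists[j] < ratio * dists[k]:
            matches.append((i, int(j)))
    return matches
```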

slide-27
SLIDE 27

SIFT is great!


// invariant to affine transformations // easy to understand // fast to compute

slide-28
SLIDE 28

Extension example: Spatial Pyramid Matching using SIFT


Beyond Bags of Features: Spatial Pyramid Matching for Recognizing Natural Scene Categories

Svetlana Lazebnik (slazebni@uiuc.edu), Beckman Institute, University of Illinois

Cordelia Schmid (Cordelia.Schmid@inrialpes.fr), INRIA Rhône-Alpes, Montbonnot, France

Jean Ponce (ponce@cs.uiuc.edu), Ecole Normale Supérieure, Paris, France

CVPR 2006

slide-29
SLIDE 29

Object Recognition from Local Scale-Invariant Features

Lowe (1999)

Histograms of Oriented Gradients for Human Detection

Dalal and Triggs (2005)

A Discriminatively Trained, Multiscale, Deformable Part Model

Felzenszwalb et al. (2008)

slide-30
SLIDE 30

Histograms of Oriented Gradients for Human Detection

Navneet Dalal and Bill Triggs, INRIA Rhône-Alpes, 655 avenue de l'Europe, Montbonnot 38334, France
{Navneet.Dalal,Bill.Triggs}@inrialpes.fr, http://lear.inrialpes.fr

Abstract: We study the question of feature sets for robust visual object recognition, adopting linear SVM based human detection as a test case. After reviewing existing edge and gradient based descriptors, we show experimentally that grids of Histograms of Oriented Gradient (HOG) descriptors significantly outperform existing feature sets for human detection. We study the influence of each stage of the computation on performance, concluding that fine-scale gradients, fine orientation binning, relatively coarse spatial binning, and high-quality local contrast normalization in overlapping descriptor blocks are all important for good results. We briefly discuss previous work on human detection in §2, give an overview of our method in §3, describe our data sets in §4 and give a detailed description and experimental evaluation of each stage of the process in §5–6. The main conclusions are summarized in §7.

2 Previous Work

There is an extensive literature on object detection, but here we mention just a few relevant papers on human detection [18,17,22,16,20]. See [6] for a survey. Papageorgiou et al [18] describe a pedestrian detector based on a polynomial SVM using rectified Haar wavelets as input descriptors.

first of all, let me put this paper in context

slide-31
SLIDE 31

Histograms of Oriented Gradients for Human Detection

Navneet Dalal and Bill Triggs, INRIA Rhône-Alpes

histograms of local image measurement have been quite successful

Swain & Ballard 1991 - Color Histograms
Schiele & Crowley 1996 - Receptive Field Histograms
Lowe 1999 - SIFT
Schneiderman & Kanade 2000 - Localized Histograms of Wavelets
Leung & Malik 2001 - Texton Histograms
Belongie et al. 2002 - Shape Context
Dalal & Triggs 2005 - Dense Orientation Histograms
...

slide-32
SLIDE 32

Histograms of Oriented Gradients for Human Detection

Navneet Dalal and Bill Triggs, INRIA Rhône-Alpes

tons of “feature sets” have been proposed

features

Gavrila & Philomen 1999 - Edge Templates + Nearest Neighbor
Papageorgiou & Poggio 2000, Mohan et al. 2001, DePoortere et al. 2002 - Haar Wavelets + SVM
Viola & Jones 2001 - Rectangular Differential Features + AdaBoost
Mikolajczyk et al. 2004 - Parts Based Histograms + AdaBoost
Ke & Sukthankar 2004 - PCA-SIFT
...

slide-33
SLIDE 33

Histograms of Oriented Gradients for Human Detection

Navneet Dalal and Bill Triggs, INRIA Rhône-Alpes

localizing humans in images is a challenging task...

difficult!

Wide variety of articulated poses
Variable appearance/clothing
Complex backgrounds
Unconstrained illumination
Occlusions
Different scales
...

slide-34
SLIDE 34

Approach

  • robust feature set (HOG)
  • simple classifier (linear SVM)
  • fast detection (sliding window)
slide-35
SLIDE 35 adapted from Bill Triggs
slide-36
SLIDE 36
  • Gamma normalization
  • Space: RGB, LAB or Gray
  • Method: SQRT or LOG
slide-37
SLIDE 37
  • Filtering with simple masks: diagonal, Sobel, uncentered, centered, cubic-corrected

* the centered mask performs best *
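The centered [-1, 0, 1] mask amounts to a centered difference in each direction; a sketch with unsigned 0-180° orientations, the paper's best-performing setting (function name is ours):

```python
import numpy as np

def centered_gradients(img):
    # the simple centered [-1, 0, 1] mask performed best in the paper:
    # dx(y, x) = img(y, x+1) - img(y, x-1), and likewise for dy
    img = img.astype(float)
    dx = np.zeros_like(img)
    dy = np.zeros_like(img)
    dx[:, 1:-1] = img[:, 2:] - img[:, :-2]
    dy[1:-1, :] = img[2:, :] - img[:-2, :]
    mag = np.hypot(dx, dy)
    ori = np.degrees(np.arctan2(dy, dx)) % 180.0   # unsigned orientation
    return mag, ori
```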

slide-38
SLIDE 38
  • Filtering with simple masks

centered

remember SIFT ?

slide-39
SLIDE 39

...after filtering, each “pixel” represents an oriented gradient...

slide-40
SLIDE 40

...pixels are grouped into "cells" and cast a weighted vote for an orientation histogram...

HOG (Histogram of Oriented Gradients)
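The cell-voting step might be sketched as follows; 8x8 cells and 9 unsigned-orientation bins are the paper's defaults, the function name is ours:

```python
import numpy as np

def hog_cells(mag, ori, cell=8, bins=9):
    # each cell accumulates a 9-bin histogram of unsigned orientations
    # (0-180 degrees), with each pixel's vote weighted by its gradient
    # magnitude
    H, W = mag.shape
    cells = np.zeros((H // cell, W // cell, bins))
    for i in range(H // cell):
        for j in range(W // cell):
            m = mag[i*cell:(i+1)*cell, j*cell:(j+1)*cell].ravel()
            o = ori[i*cell:(i+1)*cell, j*cell:(j+1)*cell].ravel()
            hist, _ = np.histogram(o, bins=bins, range=(0, 180), weights=m)
            cells[i, j] = hist
    return cells
```

The real implementation also interpolates each vote between neighboring bins and cells; this sketch uses hard assignment for brevity.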

slide-41
SLIDE 41

a window can then be represented like this

slide-42
SLIDE 42

then, cells are locally normalized using overlapping “blocks”

slide-43
SLIDE 43

they used two types of blocks

  • rectangular
  • similar to SIFT (but dense)
  • circular
  • similar to Shape Context
slide-44
SLIDE 44

and four different types of block

normalization
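The four schemes (L2-norm, L2-Hys, L1-norm, L1-sqrt) can be sketched as below; the small eps and the 0.2 clipping value follow the paper's description of L2-Hys, the function name is ours:

```python
import numpy as np

def normalize_block(block, method="l2hys", eps=1e-5):
    # block: concatenated cell histograms of one (e.g. 2x2-cell) block
    v = np.asarray(block, dtype=float).ravel()
    if method == "l1":
        return v / (np.abs(v).sum() + eps)
    if method == "l1sqrt":
        return np.sqrt(v / (np.abs(v).sum() + eps))
    v = v / np.sqrt((v**2).sum() + eps**2)      # L2-norm
    if method == "l2hys":
        # L2-Hys: L2-normalize, clip at 0.2, then renormalize
        v = np.clip(v, 0, 0.2)
        v = v / np.sqrt((v**2).sum() + eps**2)
    return v
```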

slide-45
SLIDE 45

like SIFT, they gain invariance... ...to illumination, small deformations, etc.

slide-46
SLIDE 46

finally, a sliding window is classified by a simple linear SVM
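A sliding-window scorer might be sketched like this; the `features` argument is a stand-in for the HOG extraction, and the 64x128 window with stride 8 follows the paper:

```python
import numpy as np

def sliding_windows(img, window=(128, 64), stride=8):
    # yield the top-left corner of every detection window
    H, W = img.shape
    wh, ww = window
    for y in range(0, H - wh + 1, stride):
        for x in range(0, W - ww + 1, stride):
            yield y, x

def score_windows(img, features, w, b, window=(128, 64), stride=8):
    # linear SVM score w . f + b for each window; `features` maps an
    # image patch to its descriptor vector
    return [(y, x, float(np.dot(w, features(img[y:y+window[0],
                                                x:x+window[1]])) + b))
            for y, x in sliding_windows(img, window, stride)]
```

Windows whose score exceeds a threshold become detections (followed by non-maximum suppression in practice).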

slide-47
SLIDE 47

during the learning phase, the algorithm “looked” for hard examples

Training

adapted from Martial Hebert
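Hard-example mining reduces to a rescan-and-filter loop; a toy sketch in which the `score` callable stands in for the current detector:

```python
def mine_hard_negatives(score, negatives, thresh=0.0):
    # after an initial training round, rescan the negative set and keep
    # the windows the current model wrongly fires on; these "hard"
    # examples are added to the training set and the model is retrained
    return [n for n in negatives if score(n) > thresh]
```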
slide-48
SLIDE 48

average gradients positive weights negative weights

slide-49
SLIDE 49

Example

slide-50
SLIDE 50

Example

adapted from Bill Triggs
slide-51
SLIDE 51

Example

adapted from Martial Hebert
slide-52
SLIDE 52

Results

[Figure 3 (DET curves). The performance of selected detectors on (left) MIT and (right) INRIA data sets: Lin. R-HOG, Lin. C-HOG, Lin. EC-HOG, Wavelet, PCA-SIFT, Lin. G-ShapeC, Lin. E-ShapeC, Ker. R-HOG, Lin. R2-HOG, MIT best (part), MIT baseline. See the text for details.]

~90% detection @ 1e-5 FPPW

slide-53
SLIDE 53

Experiments

[Figure 4 (DET curves, panels a-f): (a) effect of gradient scale σ; (b) number of orientation bins β; (c) normalization methods; (d) block overlap; (e) window size; (f) kernel SVM width γ.]

Figure 4. For details see the text. (a) Using fine derivative scale significantly increases the performance ('c-cor' is the 1D cubic-corrected point derivative). (b) Increasing the number of orientation bins increases performance significantly up to about 9 bins spaced over 0°–180°. (c) The effect of different block normalization schemes (see §6.4). (d) Using overlapping descriptor blocks decreases the miss rate by around 5%. (e) Reducing the 16 pixel margin around the 64×128 detection window decreases the performance by about 3%. (f) Using a Gaussian kernel SVM, exp(−γ‖x1 − x2‖²), improves the performance by about 3%.
slide-54
SLIDE 54

Experiments

[Figure 5. The miss rate at 10⁻⁴ FPPW as the cell size (4x4 to 12x12 pixels) and block size (1x1 to 4x4 cells) change. The stride (block overlap) is fixed at half of the block size.]

3×3 blocks of 6×6 pixel cells perform best, with 10.4% miss rate.

slide-55
SLIDE 55

Further Development

  • Detection on Pascal VOC (2006)
  • Human Detection in Movies (ECCV 2006)
  • US Patent by MERL (2006)
  • Stereo Vision HoG (ICVES 2008)
slide-56
SLIDE 56

Extension example:

Pyramid HoG++

slide-57
SLIDE 57

A simple demo...

slide-58
SLIDE 58

A simple demo...

VIDEO HERE

slide-59
SLIDE 59

so, it doesn’t work ?!? no no, it works... ...it just doesn’t work well...

slide-60
SLIDE 60

Object Recognition from Local Scale-Invariant Features

Lowe (1999)

Histograms of Oriented Gradients for Human Detection

Dalal and Triggs (2005)

A Discriminatively Trained, Multiscale, Deformable Part Model

Felzenszwalb et al. (2008)

slide-61
SLIDE 61

This paper describes one of the best algorithms in object detection...
slide-62
SLIDE 62

They used the following methods:

HOG Features
Deformable Part Model
Latent SVM

slide-63
SLIDE 63

They used the following methods:

HOG Features

Introduced by Dalal & Triggs (2005)

slide-64
SLIDE 64

They used the following methods:

Deformable Part Model

Introduced by Fischler & Elschlager (1973)

slide-65
SLIDE 65

They used the following methods:

Latent SVM

Introduced by the authors

slide-66
SLIDE 66

HOG Features

slide-67
SLIDE 67

Model Overview

detection root filter part filters deformation models

slide-68
SLIDE 68

HOG Features

// 8x8 pixel blocks window // features computed at different resolutions (pyramid)

slide-69
SLIDE 69

HOG Pyramid

slide-70
SLIDE 70

Deformable Part Model

slide-71
SLIDE 71

Deformable Part Model

// each part is a local property // springs capture spatial relationships // here, the springs can be “negative”

slide-72
SLIDE 72

root filter part filters deformable model Deformable Part Model

detection score =

sum of filter responses - deformation cost
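That score can be sketched as below, with a separable quadratic deformation cost per part; the names and cost parameterization are illustrative:

```python
def dpm_score(root_resp, part_resps, placements, deform):
    # detection score = root filter response + sum over parts of
    # (part filter response - deformation cost of that part's
    # displacement (dx, dy) from its anchor)
    score = root_resp
    for resp, (dx, dy), (a, b, c, d) in zip(part_resps, placements, deform):
        score += resp - (a * dx + b * dx**2 + c * dy + d * dy**2)
    return score
```

In the actual system this maximization over part placements is done efficiently for all root locations at once with generalized distance transforms.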

slide-73
SLIDE 73

Deformable Part Model: score of a placement

score(p0, ..., pn) = Σᵢ Fᵢ · φ(H, pᵢ) + Σᵢ aᵢ · (x̃ᵢ, ỹᵢ) + bᵢ · (x̃ᵢ², ỹᵢ²)

where the Fᵢ are the filters, φ(H, pᵢ) is the feature vector at position pᵢ in the pyramid H, (x̃ᵢ, ỹᵢ) is the position of part i relative to the root location, and (aᵢ, bᵢ) are the coefficients of a quadratic function on the placement.

slide-74
SLIDE 74

Latent SVM

slide-75
SLIDE 75

Latent SVM: fβ(x) = max over z of β · Φ(x, z), where β holds the filters and deformation parameters, Φ(x, z) the features, and z the part displacements.
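Scoring with a latent variable might be sketched as follows; `phi` and `latent_values` are illustrative stand-ins for Φ(x, z) and the set of allowed part displacements:

```python
import numpy as np

def latent_svm_score(beta, phi, latent_values, x):
    # f_beta(x) = max over latent values z of beta . Phi(x, z):
    # the part placement is not annotated in the training data, so it
    # is treated as latent and maximized over when scoring an example
    return max(float(np.dot(beta, phi(x, z))) for z in latent_values)
```

Training alternates between fixing beta to pick the best z for each positive example and fixing those z to solve a standard SVM problem.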

slide-76
SLIDE 76

Latent SVM

slide-77
SLIDE 77

Bonus

// Data Mining Hard Negatives // Model Initialization

slide-78
SLIDE 78

Results

Pascal VOC 2006

slide-79
SLIDE 79

Results

Models learned

slide-80
SLIDE 80

Experiments ~ Dalal’s model ~ Dalal’s + LSVM

slide-81
SLIDE 81

Examples

errors

slide-82
SLIDE 82

A simple demo...

slide-83
SLIDE 83

A simple demo...

slide-84
SLIDE 84

Conclusions

so, it doesn’t work ?!? no no, it works... ...it just doesn’t work well... ...or there is a problem with the seat-computer interface...

slide-85
SLIDE 85

Conclusion

slide-86
SLIDE 86