Reconnaissance d'objets et vision artificielle (Object recognition and computer vision)



slide-1
SLIDE 1

Reconnaissance d'objets et vision artificielle (Object recognition and computer vision)

http://www.di.ens.fr/willow/teaching/recvis09

Lecture 3

A refresher on camera geometry
Image alignment and 3D alignment

slide-2
SLIDE 2

Check it out!

"Computational photography" course by Frédo Durand
Thursdays from 9:30 to 12:30, Salle Info 2

http://people.csail.mit.edu/fredo/Classes/Comp_Photo_ENS/

slide-3
SLIDE 3

Don't forget!

First programming assignment due October 27

http://www.di.ens.fr/willow/teaching/recvis09/assignment1/

slide-4
SLIDE 4

Pinhole perspective equation

$$
\begin{cases}
x' = f' \dfrac{x}{z} \\[4pt]
y' = f' \dfrac{y}{z}
\end{cases}
$$

NOTE: z is always negative.
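The pinhole equation above can be sketched in a few lines of NumPy. This is a minimal illustration, not part of the lecture: the function name `pinhole_project` and the numeric values are hypothetical, and the points are assumed to be expressed in the camera frame with z < 0, as in the note on the slide.

```python
import numpy as np

def pinhole_project(points, f_prime):
    """Project 3D points of shape (N, 3), given in the camera frame.

    Applies x' = f' x / z and y' = f' y / z to every point.
    """
    points = np.asarray(points, dtype=float)
    x, y, z = points[:, 0], points[:, 1], points[:, 2]
    if np.any(z == 0):
        raise ValueError("points with z = 0 have no finite projection")
    return np.stack([f_prime * x / z, f_prime * y / z], axis=1)

# Hypothetical data: two points on the same ray through the pinhole (z < 0).
P = np.array([[1.0, 2.0, -4.0],
              [2.0, 4.0, -8.0]])
p = pinhole_project(P, f_prime=1.0)
print(p)  # both rows are identical: points on one ray share one image point
```

Because z is negative and f' is taken positive here, the image coordinates have the opposite sign of the scene coordinates, which is the usual image inversion of a pinhole camera.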

slide-5
SLIDE 5

Affine models: Weak perspective projection

$$
\begin{cases}
x' = -m x \\
y' = -m y
\end{cases}
\qquad \text{where } m = -\dfrac{f'}{z} \text{ is the magnification.}
$$

When the scene relief is small compared to its distance from the camera, m can be taken constant: weak perspective projection.
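To see how good the constant-magnification approximation is, the sketch below compares weak perspective with the exact pinhole projection on a shallow scene. The reference depth `z_ref` (here the mean depth of the points) and all numeric values are assumptions made for illustration.

```python
import numpy as np

def weak_perspective_project(points, f_prime, z_ref):
    """Weak perspective: x' = -m x, y' = -m y with constant m = -f' / z_ref."""
    m = -f_prime / z_ref
    points = np.asarray(points, dtype=float)
    return -m * points[:, :2]

# Hypothetical scene: depth varies by 1 unit at a distance of about 100 units.
P = np.array([[1.0, 2.0, -100.0],
              [1.0, 2.0, -101.0]])
f_prime = 1.0
z_ref = P[:, 2].mean()  # assumed reference depth

weak = weak_perspective_project(P, f_prime, z_ref)
exact = f_prime * P[:, :2] / P[:, 2:3]  # pinhole projection for comparison
max_error = np.abs(weak - exact).max()
print(max_error)  # small, because the relief is small relative to the distance
```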

slide-6
SLIDE 6

Affine models: Orthographic projection

$$
\begin{cases}
x' = x \\
y' = y
\end{cases}
$$

When the camera is at a (roughly constant) distance from the scene, take m = 1.

slide-7
SLIDE 7

Analytical camera geometry

slide-8
SLIDE 8

Coordinate Changes: Pure Translations

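$$
\overrightarrow{O_B P} = \overrightarrow{O_B O_A} + \overrightarrow{O_A P}
\quad \Leftrightarrow \quad
{}^B P = {}^A P + {}^B O_A
$$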

slide-9
SLIDE 9

Coordinate Changes: Pure Rotations

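$$
{}^B_A R =
\begin{bmatrix}
\mathbf{i}_A \cdot \mathbf{i}_B & \mathbf{j}_A \cdot \mathbf{i}_B & \mathbf{k}_A \cdot \mathbf{i}_B \\
\mathbf{i}_A \cdot \mathbf{j}_B & \mathbf{j}_A \cdot \mathbf{j}_B & \mathbf{k}_A \cdot \mathbf{j}_B \\
\mathbf{i}_A \cdot \mathbf{k}_B & \mathbf{j}_A \cdot \mathbf{k}_B & \mathbf{k}_A \cdot \mathbf{k}_B
\end{bmatrix}
=
\begin{bmatrix} {}^B \mathbf{i}_A & {}^B \mathbf{j}_A & {}^B \mathbf{k}_A \end{bmatrix}
=
\begin{bmatrix} {}^A \mathbf{i}_B^T \\ {}^A \mathbf{j}_B^T \\ {}^A \mathbf{k}_B^T \end{bmatrix}
$$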

slide-10
SLIDE 10

Coordinate Changes: Rotations about the z Axis

$$
{}^B_A R =
\begin{bmatrix}
\cos\theta & \sin\theta & 0 \\
-\sin\theta & \cos\theta & 0 \\
0 & 0 & 1
\end{bmatrix}
$$

slide-11
SLIDE 11

A rotation matrix is characterized by the following properties:

  • Its inverse is equal to its transpose, and
  • its determinant is equal to 1.

Or equivalently:

  • Its rows (or columns) form a right-handed orthonormal coordinate system.
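The two properties above translate directly into a numerical check. The sketch below is illustrative only: the helper names `is_rotation_matrix` and `rotation_about_z` and the tolerance are assumptions, and the z-axis matrix uses the coordinate-change sign convention of the previous slide.

```python
import numpy as np

def is_rotation_matrix(R, tol=1e-9):
    """True if R is 3x3, R^T R = I (inverse equals transpose) and det(R) = 1."""
    R = np.asarray(R, dtype=float)
    if R.shape != (3, 3):
        return False
    orthonormal = np.allclose(R.T @ R, np.eye(3), atol=tol)
    right_handed = np.isclose(np.linalg.det(R), 1.0, atol=tol)
    return bool(orthonormal and right_handed)

def rotation_about_z(theta):
    """Coordinate-change matrix for a rotation of angle theta about the z axis."""
    c, s = np.cos(theta), np.sin(theta)
    return np.array([[c, s, 0.0],
                     [-s, c, 0.0],
                     [0.0, 0.0, 1.0]])

R = rotation_about_z(np.pi / 6)
reflection = np.diag([1.0, 1.0, -1.0])  # orthonormal but left-handed (det = -1)

print(is_rotation_matrix(R))           # True
print(is_rotation_matrix(reflection))  # False
```

The reflection shows why the determinant condition is needed: its inverse also equals its transpose, but its rows form a left-handed frame.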
slide-12
SLIDE 12

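Coordinate changes: pure rotations

$$
\overrightarrow{OP}
= \begin{bmatrix} \mathbf{i}_A & \mathbf{j}_A & \mathbf{k}_A \end{bmatrix}
\begin{bmatrix} {}^A x \\ {}^A y \\ {}^A z \end{bmatrix}
= \begin{bmatrix} \mathbf{i}_B & \mathbf{j}_B & \mathbf{k}_B \end{bmatrix}
\begin{bmatrix} {}^B x \\ {}^B y \\ {}^B z \end{bmatrix}
\quad \Rightarrow \quad
{}^B P = {}^B_A R \, {}^A P
$$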

slide-13
SLIDE 13

Coordinate Changes: Rigid Transformations

$$
{}^B P = {}^B_A R \, {}^A P + {}^B O_A
$$

$$
\begin{bmatrix} {}^B P \\ 1 \end{bmatrix}
= \begin{bmatrix} {}^B_A R \, {}^A P + {}^B O_A \\ 1 \end{bmatrix}
= \begin{bmatrix} {}^B_A R & {}^B O_A \\ \mathbf{0}^T & 1 \end{bmatrix}
\begin{bmatrix} {}^A P \\ 1 \end{bmatrix}
= {}^B_A T \begin{bmatrix} {}^A P \\ 1 \end{bmatrix}
$$
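A short sketch of the homogeneous form of a rigid transformation follows. The helper names and the numeric values are hypothetical; the rotation is the z-axis coordinate change from the earlier slide with θ = 90°.

```python
import numpy as np

def rigid_transform(R, origin_A_in_B):
    """Build the 4x4 matrix T = [[R, O], [0^T, 1]]."""
    T = np.eye(4)
    T[:3, :3] = R
    T[:3, 3] = origin_A_in_B
    return T

def apply_transform(T, P):
    """Map a 3D point through T using homogeneous coordinates."""
    P_h = np.append(P, 1.0)
    return (T @ P_h)[:3]

# Hypothetical example: frame B is rotated 90 degrees about z relative to A.
R = np.array([[0.0, 1.0, 0.0],
              [-1.0, 0.0, 0.0],
              [0.0, 0.0, 1.0]])
O_A_in_B = np.array([1.0, 2.0, 3.0])  # origin of frame A expressed in frame B

T = rigid_transform(R, O_A_in_B)
P_A = np.array([1.0, 0.0, 0.0])
P_B = apply_transform(T, P_A)
print(P_B)  # same result as R @ P_A + O_A_in_B
```

Writing the transformation as a single matrix makes composition a matrix product, which is the main practical reason for the homogeneous form.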

slide-14
SLIDE 14

Pinhole perspective equation

$$
\begin{cases}
x' = f' \dfrac{x}{z} \\[4pt]
y' = f' \dfrac{y}{z}
\end{cases}
$$

NOTE: z is always negative.

slide-15
SLIDE 15

The intrinsic parameters of a camera

Units:

  • k, l: pixel/m
  • f: m
  • α, β: pixel

Physical image coordinates
Normalized image coordinates

slide-16
SLIDE 16

The intrinsic parameters of a camera

Calibration matrix
The perspective projection equation

slide-17
SLIDE 17

The extrinsic parameters of a camera

slide-18
SLIDE 18

Perspective projections induce projective transformations between planes

slide-19
SLIDE 19

Affine cameras

Weak-perspective projection
Paraperspective projection

slide-20
SLIDE 20

More affine cameras

Orthographic projection
Parallel projection

slide-21
SLIDE 21

Weak-perspective projection model

  • $z_r\, p = M P$ (p and P are in homogeneous coordinates)
  • $p = M P$ (P is in homogeneous coordinates)
  • $p = A P + b$ (neither p nor P is in hom. coordinates)

slide-22
SLIDE 22

Affine projections induce affine transformations from planes onto their images.
slide-23
SLIDE 23

Image alignment task

  • It helps to be able to compare descriptors of local patches surrounding interest points (cf. last lecture).
  • This is not strictly necessary. We will concentrate here on the geometry of the problem.

slide-24
SLIDE 24

Dealing with outliers

The set of putative matches still contains a very high percentage of outliers.

How do we fit a geometric transformation to a small subset of all possible matches?

Possible strategies:

  • RANSAC
  • Incremental alignment
  • Hough transform
  • Hashing
slide-25
SLIDE 25

Strategy 1: RANSAC

RANSAC loop (Fischler & Bolles, 1981):

  • Randomly select a seed group of matches
  • Compute transformation from seed group
  • Find inliers to this transformation
  • If the number of inliers is sufficiently large, re-compute the least-squares estimate of the transformation on all of the inliers
  • Keep the transformation with the largest number of inliers
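The loop above can be sketched for the simplest model, a pure 2D translation, where a single match is enough to form a seed group. This is an illustrative sketch under stated assumptions, not the lecture's implementation: the function name, the iteration count, the inlier tolerance, the minimum consensus size, and the synthetic data are all hypothetical.

```python
import numpy as np

def ransac_translation(src, dst, n_iters=100, inlier_tol=3.0, min_inliers=3, seed=0):
    """Estimate a 2D translation t with dst ~ src + t from putative matches."""
    rng = np.random.default_rng(seed)
    src = np.asarray(src, dtype=float)
    dst = np.asarray(dst, dtype=float)
    best_t, best_inliers = None, np.zeros(len(src), dtype=bool)
    for _ in range(n_iters):
        i = rng.integers(len(src))       # seed group: one randomly chosen match
        t = dst[i] - src[i]              # transformation from the seed group
        residuals = np.linalg.norm(src + t - dst, axis=1)
        inliers = residuals < inlier_tol  # matches consistent with this translation
        if inliers.sum() >= min_inliers:
            # The least-squares translation over the inliers is their mean displacement.
            t = (dst[inliers] - src[inliers]).mean(axis=0)
            if inliers.sum() > best_inliers.sum():
                best_t, best_inliers = t, inliers
    return best_t, best_inliers

# Synthetic matches: 20 follow the translation (5, -2), 10 are random outliers.
rng = np.random.default_rng(1)
src = rng.uniform(0, 100, size=(30, 2))
dst = src + np.array([5.0, -2.0])
dst[:10] = rng.uniform(0, 100, size=(10, 2))

t_est, inliers = ransac_translation(src, dst)
print(t_est, inliers.sum())
```

For richer models (affine, homography) the seed group grows to 3 or 4 matches and the refit becomes a linear least-squares problem, but the structure of the loop stays the same.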

slide-26
SLIDE 26

RANSAC example: Translation

Putative matches

slide-27
SLIDE 27

RANSAC example: Translation

Select one match, count inliers

slide-28
SLIDE 28

RANSAC example: Translation

Select one match, count inliers

slide-29
SLIDE 29

RANSAC example: Translation

Find “average” translation vector

slide-30
SLIDE 30

Strategy 2: Incremental alignment

Take advantage of strong locality constraints: only pick close-by matches to start with, and gradually add more matches in the same neighborhood.

Approach introduced in [Ayache & Faugeras, 1982; Hebert & Faugeras, 1983; Gaston & Lozano-Perez, 1984].

Illustrated here with the method from S. Lazebnik, C. Schmid and J. Ponce, “Semi-local affine parts for object recognition”, BMVC 2004.
slide-31
SLIDE 31

Incremental alignment: Details

Generating seed groups:

  • Identify triples of neighboring features (i, j, k) in the first image
  • Find all triples (i', j', k') in the second image such that i' (resp. j', k') is a putative match of i (resp. j, k), and j', k' are neighbors of i'

slide-32
SLIDE 32

Incremental alignment: Details

Beginning with each seed triple, repeat:

  • Estimate the aligning transformation between corresponding features in the current group of matches
  • Grow the group by adding other consistent matches in the neighborhood

Until the transformation is no longer consistent or no more matches can be found
slide-36
SLIDE 36

Strategy 3: Hough transform

Suppose our features are scale- and rotation-covariant

  • Then a single feature match provides an alignment hypothesis (translation, scale, orientation)

David G. Lowe. “Distinctive image features from scale-invariant keypoints”, IJCV 60 (2), pp. 91-110, 2004.

slide-37
SLIDE 37

Strategy 3: Hough transform

Suppose our features are scale- and rotation-covariant

  • Then a single feature match provides an alignment hypothesis (translation, scale, orientation)
  • Of course, a hypothesis obtained from a single match is unreliable
  • Solution: let each match vote for its hypothesis in a Hough space with very coarse bins

David G. Lowe. “Distinctive image features from scale-invariant keypoints”, IJCV 60 (2), pp. 91-110, 2004.

slide-38
SLIDE 38

Hough transform

  • An early type of voting scheme
  • General outline:
      • Discretize parameter space into bins
      • For each feature point in the image, put a vote in every bin in the parameter space that could have generated this point
      • Find bins that have the most votes

[Figure: image space and Hough parameter space]

P.V.C. Hough, Machine Analysis of Bubble Chamber Pictures, Proc. Int. Conf. High Energy Accelerators and Instrumentation, 1959
slide-39
SLIDE 39

Parameter space representation

  • A line in the image corresponds to a point in Hough space

[Figure: image space and Hough parameter space]

Source: K. Grauman

slide-40
SLIDE 40

Parameter space representation

  • What does a point (x0, y0) in the image space map to in the Hough space?
  • Answer: the solutions of b = –x0 m + y0
  • This is a line in Hough space

[Figure: image space and Hough parameter space]

Source: K. Grauman

slide-41
SLIDE 41

Parameter space representation

  • Where is the line that contains both (x0, y0) and (x1, y1)?
  • It is the intersection of the lines b = –x0 m + y0 and b = –x1 m + y1

[Figure: image space with the points (x0, y0) and (x1, y1); Hough parameter space with the line b = –x1 m + y1]

Source: K. Grauman
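The slope-intercept voting scheme described on these slides can be sketched as follows. Each image point (x, y) votes along its line b = –x m + y in a discretized (m, b) space, and the bin with the most votes gives the line shared by the most points. The function name, the grid ranges, the bin counts and the sample points are assumptions chosen for illustration.

```python
import numpy as np

def hough_vote_lines(points, m_values, b_min, b_max, n_b_bins):
    """Accumulate votes in a discretized (m, b) parameter space."""
    acc = np.zeros((len(m_values), n_b_bins), dtype=int)
    for x, y in points:
        b = -x * m_values + y  # this point's line in Hough space
        cols = np.floor((b - b_min) / (b_max - b_min) * n_b_bins).astype(int)
        valid = (cols >= 0) & (cols < n_b_bins)  # ignore votes outside the grid
        acc[np.nonzero(valid)[0], cols[valid]] += 1
    return acc

# Hypothetical data: five points on y = 2x + 1 plus two outliers.
points = [(0, 1), (1, 3), (2, 5), (3, 7), (4, 9), (1, 8), (3, 0)]
m_values = np.linspace(-5.0, 5.0, 21)        # slope bins, step 0.5
b_min, b_max, n_b_bins = -10.5, 10.5, 21     # intercept bins of width 1

acc = hough_vote_lines(points, m_values, b_min, b_max, n_b_bins)
i, j = np.unravel_index(np.argmax(acc), acc.shape)
m_est = m_values[i]
b_est = b_min + (j + 0.5) * (b_max - b_min) / n_b_bins  # center of the winning bin
print(m_est, b_est, acc.max())
```

The (m, b) parameterization cannot represent vertical lines, since the slope is unbounded; that limitation is a property of this parameterization rather than of the voting idea.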

slide-42
SLIDE 42

Hough transform details (D. Lowe’s system)

Training phase: For each model feature, record 2D location, scale, and orientation of model (relative to normalized feature frame)

Test phase: Let each match between a test and a model feature vote in a 4D Hough space

  • Use broad bin sizes of 30 degrees for orientation, a factor of 2 for scale, and 0.25 times image size for location
  • Vote for two closest bins in each dimension

Find all bins with at least three votes and perform geometric verification

  • Estimate least squares affine transformation
  • Use stricter thresholds on transformation residual
  • Search for additional features that agree with the alignment
slide-43
SLIDE 43

Affine projections induce affine transformations from planes onto their images.
slide-44
SLIDE 44

Affine transformations

An affine transformation maps a parallelogram onto another parallelogram

$$
\begin{bmatrix} u' \\ v' \\ 1 \end{bmatrix}
=
\begin{bmatrix} a_{11} & a_{12} & b_1 \\ a_{21} & a_{22} & b_2 \\ 0 & 0 & 1 \end{bmatrix}
\begin{bmatrix} u \\ v \\ 1 \end{bmatrix}
$$

slide-45
SLIDE 45

Fitting an affine transformation

Equation for affine transformation:

$$
\begin{bmatrix} u' \\ v' \\ 1 \end{bmatrix}
=
\begin{bmatrix} a_{11} & a_{12} & b_1 \\ a_{21} & a_{22} & b_2 \\ 0 & 0 & 1 \end{bmatrix}
\begin{bmatrix} u \\ v \\ 1 \end{bmatrix}
$$

9 entries, 6 degrees of freedom

Each correspondence gives 2 equations in 6 unknowns:

$$
\begin{bmatrix} u & v & 1 & 0 & 0 & 0 \\ 0 & 0 & 0 & u & v & 1 \end{bmatrix}
\begin{bmatrix} a_{11} \\ a_{12} \\ b_1 \\ a_{21} \\ a_{22} \\ b_2 \end{bmatrix}
=
\begin{bmatrix} u' \\ v' \end{bmatrix},
\qquad \text{i.e. } U \mathbf{a} = \mathbf{u}'
$$

In general uniquely determined by 3 correspondences

Linear least squares for more correspondences
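The system U a = u' can be stacked over all correspondences and solved with ordinary least squares. The sketch below uses `numpy.linalg.lstsq`; the function name `fit_affine` and the test transformation are assumptions for illustration, with the unknowns ordered as (a11, a12, b1, a21, a22, b2).

```python
import numpy as np

def fit_affine(src, dst):
    """Least-squares affine fit: dst ~ A @ src + b, from N >= 3 correspondences."""
    src = np.asarray(src, dtype=float)
    dst = np.asarray(dst, dtype=float)
    n = len(src)
    if n < 3:
        raise ValueError("at least 3 correspondences are needed")
    U = np.zeros((2 * n, 6))
    U[0::2, 0:2] = src   # rows [u, v, 1, 0, 0, 0]
    U[0::2, 2] = 1.0
    U[1::2, 3:5] = src   # rows [0, 0, 0, u, v, 1]
    U[1::2, 5] = 1.0
    rhs = dst.reshape(-1)  # [u'_1, v'_1, u'_2, v'_2, ...]
    a, *_ = np.linalg.lstsq(U, rhs, rcond=None)
    A = np.array([[a[0], a[1]], [a[3], a[4]]])
    b = np.array([a[2], a[5]])
    return A, b

# Hypothetical ground truth used to generate exact correspondences.
A_true = np.array([[2.0, 0.5], [-0.5, 1.5]])
b_true = np.array([3.0, -1.0])
src = np.array([[0.0, 0.0], [1.0, 0.0], [0.0, 1.0], [1.0, 1.0]])
dst = src @ A_true.T + b_true

A_est, b_est = fit_affine(src, dst)
print(A_est, b_est)
```

With exactly 3 non-collinear correspondences the 6x6 system has a unique solution; with more, the same call returns the least-squares estimate.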

slide-46
SLIDE 46

Strategy 4: Hashing

Make each invariant image feature into a low-dimensional “key” that indexes into a table of hypotheses

[Figure: hash table, model]

slide-47
SLIDE 47

Strategy 4: Hashing

Make each invariant image feature into a low-dimensional “key” that indexes into a table of hypotheses

Given a new test image, compute the hash keys for all features found in that image, access the table, and look for consistent hypotheses

[Figure: test image, hash table, model]

slide-48
SLIDE 48

Strategy 4: Hashing

Make each invariant image feature into a low-dimensional “key” that indexes into a table of hypotheses

Given a new test image, compute the hash keys for all features found in that image, access the table, and look for consistent hypotheses

This can even work when we don’t have any feature descriptors: we can take n-tuples of neighboring features and compute invariant hash codes from their geometric configurations

[Figure: a tuple of neighboring features labeled A, B, C, D]

slide-49
SLIDE 49

Beyond affine transformations

What is the transformation between two views of a planar surface?

What is the transformation between images from two cameras that share the same center?

slide-50
SLIDE 50

Perspective projections induce projective transformations between planes

slide-51
SLIDE 51

Beyond affine transformations

Homography: plane projective transformation (transformation taking a quad to another arbitrary quad)

slide-52
SLIDE 52

Fitting a homography

Recall: homogeneous coordinates

Converting to homogeneous image coordinates
Converting from homogeneous image coordinates

slide-53
SLIDE 53

Fitting a homography

Recall: homogeneous coordinates

Converting to homogeneous image coordinates
Converting from homogeneous image coordinates

Equation for homography:

$$
\lambda \begin{bmatrix} x' \\ y' \\ 1 \end{bmatrix}
=
\begin{bmatrix} h_{11} & h_{12} & h_{13} \\ h_{21} & h_{22} & h_{23} \\ h_{31} & h_{32} & h_{33} \end{bmatrix}
\begin{bmatrix} x \\ y \\ 1 \end{bmatrix}
$$

slide-54
SLIDE 54

Fitting a homography

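Equation for homography:

$$
\lambda \begin{bmatrix} x_i' \\ y_i' \\ 1 \end{bmatrix}
=
\begin{bmatrix} h_{11} & h_{12} & h_{13} \\ h_{21} & h_{22} & h_{23} \\ h_{31} & h_{32} & h_{33} \end{bmatrix}
\begin{bmatrix} x_i \\ y_i \\ 1 \end{bmatrix},
\qquad
\lambda \mathbf{x}_i' = H \mathbf{x}_i
= \begin{bmatrix} \mathbf{h}_1^T \\ \mathbf{h}_2^T \\ \mathbf{h}_3^T \end{bmatrix} \mathbf{x}_i
$$

9 entries, 8 degrees of freedom (scale is arbitrary)

$$
\mathbf{x}_i' \times H \mathbf{x}_i = \mathbf{0}
$$

$$
\mathbf{x}_i' \times H \mathbf{x}_i =
\begin{bmatrix}
y_i' \, \mathbf{h}_3^T \mathbf{x}_i - \mathbf{h}_2^T \mathbf{x}_i \\
\mathbf{h}_1^T \mathbf{x}_i - x_i' \, \mathbf{h}_3^T \mathbf{x}_i \\
x_i' \, \mathbf{h}_2^T \mathbf{x}_i - y_i' \, \mathbf{h}_1^T \mathbf{x}_i
\end{bmatrix}
$$

$$
\begin{bmatrix}
\mathbf{0}^T & -\mathbf{x}_i^T & y_i' \, \mathbf{x}_i^T \\
\mathbf{x}_i^T & \mathbf{0}^T & -x_i' \, \mathbf{x}_i^T \\
-y_i' \, \mathbf{x}_i^T & x_i' \, \mathbf{x}_i^T & \mathbf{0}^T
\end{bmatrix}
\begin{pmatrix} \mathbf{h}_1 \\ \mathbf{h}_2 \\ \mathbf{h}_3 \end{pmatrix}
= \mathbf{0}
$$

3 equations, only 2 linearly independent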

slide-55
SLIDE 55

Direct linear transform

$$
\begin{bmatrix}
\mathbf{0}^T & -\mathbf{x}_1^T & y_1' \, \mathbf{x}_1^T \\
\mathbf{x}_1^T & \mathbf{0}^T & -x_1' \, \mathbf{x}_1^T \\
\vdots & \vdots & \vdots \\
\mathbf{0}^T & -\mathbf{x}_n^T & y_n' \, \mathbf{x}_n^T \\
\mathbf{x}_n^T & \mathbf{0}^T & -x_n' \, \mathbf{x}_n^T
\end{bmatrix}
\begin{pmatrix} \mathbf{h}_1 \\ \mathbf{h}_2 \\ \mathbf{h}_3 \end{pmatrix}
= \mathbf{0},
\qquad \text{i.e. } A \mathbf{h} = \mathbf{0}
$$

H has 8 degrees of freedom (9 parameters, but scale is arbitrary)

One match gives us two linearly independent equations

Four matches needed for a minimal solution (null space of 8x9 matrix)

More than four: homogeneous least squares
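A compact sketch of the direct linear transform follows: two rows per match are stacked into A, and h is taken as the right singular vector of A with the smallest singular value, which is the null space for four exact matches and the homogeneous least-squares solution for more. The function names and the test homography are assumptions; coordinate normalization, which is commonly applied before the DLT to improve conditioning, is left out to keep the sketch short.

```python
import numpy as np

def fit_homography_dlt(src, dst):
    """Estimate H (up to scale) such that dst ~ H @ src in homogeneous coordinates."""
    src = np.asarray(src, dtype=float)
    dst = np.asarray(dst, dtype=float)
    if len(src) < 4:
        raise ValueError("at least 4 matches are needed")
    rows = []
    zero = np.zeros(3)
    for (x, y), (xp, yp) in zip(src, dst):
        X = np.array([x, y, 1.0])
        rows.append(np.concatenate([zero, -X, yp * X]))   # [0^T, -x^T, y' x^T]
        rows.append(np.concatenate([X, zero, -xp * X]))   # [x^T, 0^T, -x' x^T]
    A = np.array(rows)
    _, _, Vt = np.linalg.svd(A)
    H = Vt[-1].reshape(3, 3)  # singular vector for the smallest singular value
    return H / np.linalg.norm(H)

def apply_homography(H, pts):
    """Convert to homogeneous coordinates, apply H, and convert back."""
    pts = np.asarray(pts, dtype=float)
    pts_h = np.hstack([pts, np.ones((len(pts), 1))])
    out = pts_h @ H.T
    return out[:, :2] / out[:, 2:3]

# Hypothetical ground-truth homography used to generate exact matches.
H_true = np.array([[1.0, 0.2, 3.0],
                   [0.1, 1.5, -2.0],
                   [0.001, 0.002, 1.0]])
src = np.array([[0.0, 0.0], [10.0, 0.0], [10.0, 10.0], [0.0, 10.0], [5.0, 3.0]])
dst = apply_homography(H_true, src)

H_est = fit_homography_dlt(src, dst)
print(H_est / H_est[2, 2])  # rescaled for comparison with H_true
```

Because H is only defined up to scale, the estimate is compared with the ground truth after dividing by its bottom-right entry.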

slide-56
SLIDE 56

Application: Panorama stitching

Images courtesy of A. Zisserman.

slide-57
SLIDE 57

Recognizing panoramas

Given contents of a camera memory card, automatically figure out which pictures go together and stitch them together into panoramas

  • M. Brown and D. Lowe, “Recognizing panoramas”, ICCV 2003.
slide-58
SLIDE 58
  • 1. Estimate homography (RANSAC)
slide-59
SLIDE 59
  • 1. Estimate homography (RANSAC)
slide-60
SLIDE 60
  • 1. Estimate homography (RANSAC)
slide-61
SLIDE 61
  • 2. Find connected sets of images
slide-62
SLIDE 62
  • 2. Find connected sets of images
slide-63
SLIDE 63
  • 2. Find connected sets of images
slide-64
SLIDE 64
  • 3. Stitch and blend the panoramas
slide-65
SLIDE 65

Results

slide-66
SLIDE 66

Issues in alignment-based applications

Choosing the geometric alignment model

  • Tradeoff between “correctness” and robustness (also, efficiency)

Choosing the descriptor

  • “Rich” imagery (natural images): high-dimensional patch-based descriptors (e.g., SIFT)
  • “Impoverished” imagery (e.g., star fields): need to create invariant geometric descriptors from k-tuples of point-based features

Strategy for finding putative matches

  • Small number of images, one-time computation (e.g., panorama stitching): brute force search
  • Large database of model images, frequent queries: indexing or hashing
  • Heuristics for feature-space pruning of putative matches
slide-67
SLIDE 67

Issues in alignment-based applications

Choosing the geometric alignment model
Choosing the descriptor
Strategy for finding putative matches

Hypothesis generation strategy

  • Relatively large inlier ratio: RANSAC
  • Small inlier ratio: locality constraints, Hough transform

Hypothesis verification strategy

  • Size of consensus set, residual tolerance depend on inlier ratio and expected accuracy of the model
  • Possible refinement of geometric model
  • Dense verification
slide-68
SLIDE 68

Affine Patches for 3D Alignment

Repeatability, covariance, invariance

Tell & Carlsson (2000); Kadir & Brady (2001); Matas et al. (2001); Tuytelaars & Van Gool (2002)

slide-69
SLIDE 69

Modeling and recognizing 3D rigid solids

Idea:

  • The (smooth) surface of a solid is never globally planar,
  • but it is always locally planar

Rothganger et al. (CVPR’03)

S = M×N; S → M, N; E ← |S − MN|

Tomasi & Kanade (1992)

Johnson & Hebert (1998); Lowe (1999)

Duda & Hart (1972); Weiss (1987); Burns et al. (1992); Mundy et al. (1992, 1994); Rothwell et al. (1992)

Ayache & Faugeras (1982); Hebert & Faugeras (1983); Gaston et al. (1984); Huttenlocher & Ullman (1987)

slide-70
SLIDE 70

20 images

slide-71
SLIDE 71
slide-72
SLIDE 72

Dataset: 51 test images with 1 to 5 of the 8 objects present in each image.

slide-73
SLIDE 73
slide-74
SLIDE 74
slide-75
SLIDE 75

The four failures Some successes