COMP37111 Advanced Computer Graphics
5: Model Acquisition - 3
Toby Howard, School of Computer Science, The University of Manchester
toby.howard@manchester.ac.uk


Geometry from images

§ Our goal: to automatically extract geometry from images
§ This is a hard problem, and the subject of much current research…
§ We will look at the concepts, not the mathematical detail
§ We’ll cover:
  1. Geometry from static images (photographs)
  2. Geometry from image sequences (video)


What is a camera?

§ The basic “pinhole camera” model
§ Point P forms an image P’ on an Image Plane at a distance F (the focal length) away from the optical centre O (lens)

[Figure: a point P in the world frame (X, Y, Z) projects through the optical centre O of the camera frame (I, J, K) to P’ on the Image Plane, a distance F from O]
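Under this model, the image-plane coordinates of P’ follow from similar triangles. A minimal sketch in Python (the function name and coordinate conventions are illustrative, not from the lecture):

```python
def pinhole_project(P, f):
    """Project a 3D point P = (X, Y, Z), given in the camera frame,
    onto the image plane at focal length f (pinhole camera model).
    Returns the 2D image-plane coordinates (x', y')."""
    X, Y, Z = P
    if Z == 0:
        raise ValueError("point lies in the plane of the optical centre")
    # Perspective division: similar triangles give x' = f X / Z, y' = f Y / Z
    return (f * X / Z, f * Y / Z)

# A point twice as far away projects to half the image-plane offset
print(pinhole_project((2.0, 1.0, 4.0), 1.0))  # (0.5, 0.25)
print(pinhole_project((2.0, 1.0, 2.0), 1.0))  # (1.0, 0.5)
```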


Camera parameters

§ Extrinsic parameters define:
  § location of camera origin with respect to world frame
  § orientation of camera frame with respect to world frame
§ Intrinsic parameters define how coordinates in the camera frame map to pixel coordinates on the image plane:
  § Focal length
  § Image plane size
  § Position of image plane
  § Image plane “skew” (angle between rows and columns of CCD sensor)
§ If we can find the extrinsic and intrinsic camera parameters for an image, we say the image is “calibrated”, and we can then extract geometry from the image


Intrinsic parameters

§ The intrinsic parameters of a camera are expressed by its calibration matrix K:

        | f·Sx   f·Sθ   Ox |
    K = |  0     f·Sy   Oy |
        |  0      0      1 |

§ Where
  § Sx and Sy are the dimensions of the image plane (in pixels)
  § Ox and Oy are the coordinates of the centre of the image plane with respect to the focal point
  § Sθ is the skew of the image plane (deviation from square)
  § f is the focal length of the camera
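Applying K maps camera-frame coordinates to pixel coordinates via a homogeneous multiply and divide. A sketch, assuming zero skew and illustrative parameter values (not from the lecture):

```python
import numpy as np

def calibration_matrix(f, sx, sy, s_theta, ox, oy):
    """Assemble the intrinsic calibration matrix K from the parameters
    named on the slide (a sketch; the exact placement of each parameter
    follows one common textbook convention)."""
    return np.array([[f * sx, f * s_theta, ox],
                     [0.0,    f * sy,      oy],
                     [0.0,    0.0,         1.0]])

def camera_to_pixel(K, P_cam):
    """Map a 3D point in the camera frame to pixel coordinates:
    multiply by K, then divide by the homogeneous coordinate."""
    p = K @ np.asarray(P_cam)
    return p[:2] / p[2]

# Illustrative values: unit focal length, 800 px/unit, centre at (320, 240)
K = calibration_matrix(f=1.0, sx=800, sy=800, s_theta=0.0, ox=320, oy=240)
print(camera_to_pixel(K, (0.1, -0.05, 1.0)))  # [400. 200.]
```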


Finding the calibration matrix

§ How can we find K for a camera?
§ If we have the actual camera:
  § Photograph a known calibration object, and calculate K directly from the resulting image
§ If we only have an image (which is the general case):
  § We can estimate K by finding calibration features in the image:
    § Parallel lines
    § Orthogonal lines

[Figures: a calibration object; calibration features in an image]


Calculating K using a calibration object


Calibration features

Estimating K using calibration features in an image


Lens distortion

§ But before we try to do anything with an image, we must correct for distortions caused by the camera lens § All lenses distort to some extent


“Straight lines” aren’t straight

Barrel distortion in an image


Barrel distortion in an image


Removing distortion using image warping

§ Removing distortion is an example of image warping
§ The user selects a distorted straight edge E
§ Then marks a line L on the image to indicate how the original edge should have looked
§ We can then create a non-linear WARP transformation which maps L to E

[Figure: a distorted edge E and the marked straight line L]
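Barrel distortion is commonly modelled as a radial displacement from the image centre. A minimal sketch of undoing it, assuming the standard one-coefficient radial model (an assumption; the lecture describes a more general interactive warp):

```python
def undistort_point(xd, yd, k1, cx, cy):
    """Map a distorted image point (xd, yd) back toward its undistorted
    position using the one-coefficient radial model x_d = x_u (1 + k1 r^2),
    where r is the undistorted radius from the centre (cx, cy).
    Solved by fixed-point iteration; k1 < 0 models barrel distortion."""
    xu, yu = xd - cx, yd - cy           # initial guess, relative to centre
    for _ in range(20):                  # iterate on the radius estimate
        r2 = xu * xu + yu * yu
        xu = (xd - cx) / (1 + k1 * r2)
        yu = (yd - cy) / (1 + k1 * r2)
    return xu + cx, yu + cy

# With no distortion (k1 = 0) the point is unchanged
print(undistort_point(120.0, 80.0, 0.0, 100.0, 100.0))  # (120.0, 80.0)
```

With a negative k1 the corrected point moves outward from the centre, straightening the bowed edges seen in the barrel-distorted images.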


Barrel distortion in an image


Barrel distortion in an image – removed by warping


Calibrating a single image

§ We can calibrate from a single image IF we can identify at least 2 vanishing points, and we choose an origin

[Figure: the user marks parallel lines and orthogonal lines in the image, fixing an origin and the X and Y axes]
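When the two vanishing points come from orthogonal scene directions, their rays through the optical centre are perpendicular, which pins down the focal length. A textbook-style sketch, assuming zero skew, square pixels, and a known principal point (assumptions not stated on the slide):

```python
import math

def focal_from_vanishing_points(v1, v2, principal_point):
    """Estimate the focal length f (in pixels) from two vanishing points
    v1, v2 of orthogonal scene directions. The rays (v - p, f) through
    the optical centre must be perpendicular, so
    (v1 - p) . (v2 - p) + f^2 = 0."""
    px, py = principal_point
    dot = (v1[0] - px) * (v2[0] - px) + (v1[1] - py) * (v2[1] - py)
    if dot >= 0:
        raise ValueError("vanishing points inconsistent with orthogonal directions")
    return math.sqrt(-dot)

# Symmetric vanishing points either side of the principal point
print(focal_from_vanishing_points((900, 300), (-300, 300), (300, 300)))  # 600.0
```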


Case study 3: Geometry reconstruction with Icarus

§ In the Icarus system, the user interactively matches features in a calibrated image with a set of pre-defined shapes (cubes, spheres, cylinders, etc.)
§ Because the image is calibrated, the synthetic objects can be accurately positioned, scaled and rotated into the scene
§ Paper to read: S. Gibson and T.L.J. Howard, “Interactive reconstruction of virtual environments from photographs, with application to scene-of-crime analysis.”


Original single image


Reconstructed geometry

User interactively approximates scene geometry by drawing primitive shapes in the calibrated image space.


Geometry, simply rendered

The approximated geometry can then be rendered.

Textures extracted and rendered

Or we can grab the pixel colours from the original image and map these as textures onto the reconstructed primitives.


Novel view of 3D geometry

Because this is a true 3D geometric model, we can view it from any angle. However, some views may require information that was not available from the original image sequence. This is ongoing research.


Calibrating from a single image

§ Unfortunately for this method, many images simply do not feature clearly parallel lines…
§ So we tend to work with sets of multiple images


Calibrating from multiple images

§ Suppose we have at least 2 images of a scene, taken with the same camera
§ If we can identify at least 8 corresponding points in each image, we can estimate the intrinsic and extrinsic camera parameters
§ For details, see the following papers (not examinable: for interest only, but we recommend you read them; available on the course webpage)
  § H.C. Longuet-Higgins, “A Computer Algorithm for Reconstructing a Scene from Two Projections”, Nature, vol. 293, 1981, pp. 133-135.
  § R. Hartley, “In Defense of the Eight-Point Algorithm”
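The core of the eight-point algorithm is a small linear-algebra problem: each correspondence gives one linear constraint x′ᵀ F x = 0 on the fundamental matrix F, and eight or more constraints are solved by least squares. A minimal, unnormalised sketch (Hartley’s paper is precisely about why the points should be normalised first):

```python
import numpy as np

def eight_point(pts1, pts2):
    """Estimate the fundamental matrix F from >= 8 correspondences
    (x, y) <-> (x', y'). Each pair gives one row of a homogeneous
    system A f = 0; f is the right singular vector of A with the
    smallest singular value. Unnormalised sketch only."""
    A = np.array([[xp * x, xp * y, xp, yp * x, yp * y, yp, x, y, 1.0]
                  for (x, y), (xp, yp) in zip(pts1, pts2)])
    _, _, Vt = np.linalg.svd(A)
    F = Vt[-1].reshape(3, 3)
    # A true fundamental matrix has rank 2: zero the smallest singular value
    U, S, Vt = np.linalg.svd(F)
    S[2] = 0.0
    return U @ np.diag(S) @ Vt

# Synthetic check: unit-focal cameras, second camera translated along X,
# so corresponding points satisfy x' = x + 1/Z, y' = y
pts3d = [(0.5, -0.2, 2.0), (1.3, 0.4, 3.1), (-0.7, 1.1, 2.6),
         (0.2, 0.9, 4.0), (1.8, -1.0, 3.3), (-1.2, -0.4, 2.2),
         (0.9, 1.5, 5.0), (-0.3, -1.3, 2.9), (1.1, 0.1, 3.7)]
pts1 = [(X / Z, Y / Z) for X, Y, Z in pts3d]
pts2 = [((X + 1) / Z, Y / Z) for X, Y, Z in pts3d]
F = eight_point(pts1, pts2)
# Every correspondence should satisfy the epipolar constraint x'^T F x = 0
```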


Calibrating from multiple images

§ The user interactively marks corresponding points in each image

[Figure: corresponding points marked in two images, with a chosen origin]

World frame and camera frames

[Figure: Image 1 and Image 2 with their View 1 and View 2 camera frames, and the world frame with axes X, Y, Z]

Calibrating from multiple images

§ How the world frame would be seen from each view

[Figure: the world frame axes X, Y, Z and origin, as seen from each of the two views]


Extraction of geometry

§ We now have a calibrated system
§ In other words, we know:
  § The position and orientation of the world frame
  § For each view, the position and orientation of the camera frame (with respect to the world frame)
§ Now, we can interactively draw in 3D with the correct camera parameters


Interactive extraction of geometry

§ We draw vertices over the images (compare Mme Gouraud), and since we know the camera positions, we can compute 3D coordinates of each vertex
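Recovering a vertex’s 3D position from its marked positions in two calibrated views is a triangulation problem. A minimal linear (DLT-style) sketch, assuming 3×4 projection matrices are already known for each view (the matrices and point below are illustrative):

```python
import numpy as np

def triangulate(P1, P2, x1, x2):
    """Recover a 3D point from its projections x1, x2 in two views with
    known 3x4 projection matrices P1, P2. Linear (DLT) triangulation:
    each view contributes two rows of a homogeneous system A X = 0."""
    A = np.vstack([
        x1[0] * P1[2] - P1[0],
        x1[1] * P1[2] - P1[1],
        x2[0] * P2[2] - P2[0],
        x2[1] * P2[2] - P2[1],
    ])
    _, _, Vt = np.linalg.svd(A)
    X = Vt[-1]
    return X[:3] / X[3]               # back from homogeneous coordinates

# Two unit-focal cameras, the second translated 1 unit along X
P1 = np.hstack([np.eye(3), np.zeros((3, 1))])
P2 = np.hstack([np.eye(3), np.array([[-1.0], [0.0], [0.0]])])
X = np.array([0.5, 0.2, 4.0])                      # ground-truth vertex
x1, x2 = X[:2] / X[2], (X[:2] + [-1, 0]) / X[2]    # its two projections
print(triangulate(P1, P2, x1, x2))                  # recovers [0.5, 0.2, 4.0]
```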


Extracting pixel colours from the image

§ We can sample pixel colours from the original images and map them onto the polygon mesh as textures

[Figures: the raw polygon mesh; the mesh rendered with textures]


Geometry from video sequences

§ What is a video sequence?
  § It’s a set of images – 24 images per second
  § Usually, there is coherence between frames too (if no edits)
§ Goal: to automatically calibrate the camera in each frame, to obtain the camera motion for the whole sequence
§ Approach: as before, we identify corresponding points between images…
§ …but this time each “image” is a frame from a video sequence
§ We try to do this automatically, using computer vision techniques


Automatic feature detection


Automatic feature detection

These features have been automatically identified


Feature detection

§ How can we automatically detect suitable features?
§ This is a very hard problem…
§ …once again we find that humans can detect features instantly (without even trying), but for a computer it is HARD
§ We use image-processing techniques to analyse the pixels, looking for edges or corners in the image
§ This is a whole topic in itself
§ One example: the Canny Edge Detector
  § John Canny, “A computational approach to edge detection”, IEEE Transactions on Pattern Analysis and Machine Intelligence 8(6), 1986


The Canny edge detector

The algorithm works on a grey-scale image and searches for pixel groupings which look like edges (see COMP27112)
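The first stage of Canny-style edge detection — image gradients via Sobel filters followed by a magnitude threshold — can be sketched with plain NumPy. This is a simplified sketch only: full Canny adds Gaussian smoothing, non-maximum suppression and hysteresis thresholding.

```python
import numpy as np

def sobel_edges(img, thresh):
    """Crude edge map: convolve a grey-scale image with the 3x3 Sobel
    kernels and threshold the gradient magnitude. Only the gradient
    stage of Canny -- no smoothing, NMS, or hysteresis."""
    kx = np.array([[-1, 0, 1], [-2, 0, 2], [-1, 0, 1]], float)
    ky = kx.T
    h, w = img.shape
    gx = np.zeros((h - 2, w - 2))
    gy = np.zeros((h - 2, w - 2))
    for i in range(h - 2):              # direct convolution, for clarity
        for j in range(w - 2):
            patch = img[i:i + 3, j:j + 3]
            gx[i, j] = np.sum(kx * patch)
            gy[i, j] = np.sum(ky * patch)
    return np.hypot(gx, gy) > thresh

# A vertical step edge is detected down the middle, not in flat regions
img = np.zeros((5, 6)); img[:, 3:] = 1.0
print(sobel_edges(img, 1.0).astype(int))
# [[0 1 1 0]
#  [0 1 1 0]
#  [0 1 1 0]]
```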


The Canny edge detector


From edges to corners

Likely candidate for a corner


Two important corner feature detection algorithms

§ SUSAN: Smallest Univalue Segment Assimilating Nucleus
  Smith, S.M. and Brady, J.M., “SUSAN: A New Approach to Low Level Image Processing”, International Journal of Computer Vision, 23(1), pp. 45-78, 1997.
§ SIFT: Scale-Invariant Feature Transform
  Lowe, D.G., “Distinctive Image Features from Scale-Invariant Keypoints”, International Journal of Computer Vision, 60(2), pp. 91-110, 2004.
§ Both papers are on the course webpage (for interest only, not examinable)


SUSAN: the basic idea

§ We examine the relationship between the value of a pixel, and the average brightness of other pixels in a “mask region” associated with it

[Figure, adapted from Steve Smith’s paper: a mask region about 30 pixels wide, centred on the pixel being examined, overlapping dark and light areas of the image]

Analysing the changes of relative brightness in a mask can estimate the existence of corners
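The idea can be sketched directly: count how many mask pixels have nearly the same brightness as the central pixel (the “USAN” area); the count is largest in flat regions, about half on an edge, and smallest at a corner. A simplified sketch with a square mask and a hard threshold, rather than the circular mask and smooth similarity function of the paper:

```python
import numpy as np

def usan_area(img, i, j, radius, t):
    """Count mask pixels whose brightness is within t of the central
    pixel (the USAN area). Square mask for simplicity; the paper uses
    a circular ~37-pixel mask and a smooth similarity function."""
    nucleus = img[i, j]
    mask = img[i - radius:i + radius + 1, j - radius:j + radius + 1]
    return int(np.sum(np.abs(mask - nucleus) <= t))

# Synthetic image: bright square occupying the lower-right quadrant
img = np.zeros((20, 20)); img[8:, 8:] = 1.0
flat = usan_area(img, 4, 4, 3, 0.5)     # whole 7x7 mask similar
edge = usan_area(img, 12, 8, 3, 0.5)    # roughly half the mask similar
corner = usan_area(img, 8, 8, 3, 0.5)   # smallest USAN -> corner
print(flat, edge, corner)  # 49 28 16
```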


Automatic feature detection


Tracking features between frames

§ For each automatically-identified feature, we now attempt to track it from frame N to frame N+1 (then N+2… etc.)
§ Then, for each pair of adjacent frames, we can estimate the camera calibration (remember the camera is jiggling about)

[Figure: frames N, N+1 and N+2, with the camera moving left to right]


Feature tracking

§ How can we automatically track features between frames?
§ This is a very hard problem…
§ …humans can track features effortlessly, but for a computer it is HARD
§ Example: look at this feature, moving diagonally: through the window, we can only see horizontal motion (the “aperture problem”)
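One basic tracker is block matching: take a small patch around the feature in frame N and find the position in frame N+1 with the smallest sum of squared differences (SSD). A minimal brute-force sketch (real trackers such as KLT use gradient-based search instead); note that on a straight edge many positions along the edge match equally well, which is exactly the aperture problem illustrated above:

```python
import numpy as np

def track_patch(frame_a, frame_b, pos, half, search):
    """Track the feature at pos = (row, col) from frame_a to frame_b
    by brute-force SSD block matching within +/- search pixels."""
    r, c = pos
    template = frame_a[r - half:r + half + 1, c - half:c + half + 1]
    best, best_pos = np.inf, pos
    for dr in range(-search, search + 1):
        for dc in range(-search, search + 1):
            rr, cc = r + dr, c + dc
            patch = frame_b[rr - half:rr + half + 1, cc - half:cc + half + 1]
            ssd = np.sum((patch - template) ** 2)
            if ssd < best:
                best, best_pos = ssd, (rr, cc)
    return best_pos

# A bright blob shifted 2 pixels right between frames is tracked there
a = np.zeros((15, 15)); a[6:9, 6:9] = 1.0
b = np.zeros((15, 15)); b[6:9, 8:11] = 1.0
print(track_patch(a, b, (7, 7), 3, 4))  # (7, 9)
```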


Tracking features between frames

The lines show a history of where the feature was in previous frames


Result of calibration

[Figure: the detected features, the path of the camera with its origin, and the camera orientation along the path]

Geometry reconstruction

User interactively approximates scene geometry by drawing primitive shapes in the calibrated image space.


Geometry reconstruction

User interactively approximates scene geometry by drawing primitive shapes in the calibrated image space.

Geometry reconstruction

The approximated geometry can then be rendered. In this example we’re using the simple OpenGL lighting model.


Geometry reconstruction

Or we can grab the pixel colours from the original image and map these as textures onto the reconstructed primitives.

Geometry reconstruction

Because this is a true 3D geometric model, we can view it from any angle. However, some views may require information that was not available from the original image sequence. This is ongoing research.

Augmented reality

Because the video sequence is calibrated, we can render synthetic CG objects using the same camera parameters, and composite them into the video. This is ongoing research.


Summary

§ In order to extract geometry from an image, we need to know the camera calibration
§ We can estimate camera calibrations from images or image sequences
§ We then extract geometry by interactively approximating the scene geometry with synthetic objects, rendered using transformations which match the extracted camera calibration