ROBOTICS 01PEEQW, Basilio Bona, DAUIN, Politecnico di Torino
Mobile & Service Robotics: Sensors for Robotics (04/05/2015)
Vision

Vision is the most important sense in humans and is becoming important in robotics as well:
- not expensive
- rich in information

Vision includes three steps:
- data recording and transformation in the retina
- data transmission through the optic nerves
- data elaboration by the brain
ROBOTICS 01PEEQW - 2014/2015
Vision sensors: hardware

- CCD (Charge-Coupled Device): light-sensitive discharging capacitors of 5 to 25 micron
- CMOS (Complementary Metal Oxide Semiconductor) technology
Artificial vision issues

- Projection from a 3D world onto a 2D plane: perspective projection (transformation matrices)
- Discretization effects due to pixels (CCD or CMOS)
- Misalignment errors (hardware)

(figures: parallel 3D lines appear as converging lines in the image; pixel discretization)
Camera models

Pinhole camera (aka perspective camera)
Pinhole camera

(figure: two image planes A and B behind the hole; the image is reversed)

- Decreasing the image plane distance or the hole diameter makes the point images sharper
- Increasing the hole diameter makes the point images brighter
- Infinite depth-of-field and infinite depth-of-focus
Camera models

Thin lens camera: the lens has a thickness d that is negligible compared to the radii of curvature of the lens surfaces R1 and R2. Rays are refracted as they go through the lens (refraction index n).

Thin lens equation:

$$\frac{1}{f} \approx (n-1)\left(\frac{1}{R_1} - \frac{1}{R_2}\right); \qquad R_i > 0 \text{ if convex}$$
Thin lens camera

- The thin lens camera is reversible
- Rays parallel to the optical axis pass through the focus, and vice versa
- Rays through the lens center are not refracted
- There are two symmetrical foci, each at distance f from the lens center
- Real lenses show aberration phenomena

(figure: optical axis, lens center, the two foci)
Aberration

(figure: spherical aberration)
Image formation

(figure: a 3D object imaged through a lens; optical axis, focal plane with focus F, principal image plane π, reversed image plane π′)

Thin lens approximation ≈ pinhole camera
Image formation and equations

(figure: real object at object distance p, image at image distance q, focal distance f, focal plane, image plane, field-of-view angle)

Lens equation:

$$\frac{1}{p} + \frac{1}{q} = \frac{1}{f} \;\Rightarrow\; (p+q)f = pq \;\Rightarrow\; q = \frac{pf}{p-f}$$

If $p \gg f$ then $q \approx f$: the image plane is approximately in the focal plane.
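The lens equation 1/p + 1/q = 1/f translates directly into code; a minimal sketch (function name and the numeric values are illustrative, not from the slides):

```python
def image_distance(p, f):
    """Image distance q from the thin lens equation 1/p + 1/q = 1/f."""
    if p <= f:
        raise ValueError("object must lie beyond the focal distance (p > f)")
    return p * f / (p - f)

# An object 10 m away seen through a 50 mm lens: since p >> f,
# the image forms approximately in the focal plane (q close to f).
q = image_distance(10.0, 0.05)
```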
Image formation

(figure: camera frame with center $C_c$ and axes $i_c$, $k_c$; focus F; image plane π; a 3D point P projects to an image point)

A 3D point P is described by its 2D image coordinates and its 3D camera coordinates:

$$\mathbf{P} \Rightarrow \mathbf{p}^i = \begin{pmatrix} x_i \\ y_i \end{pmatrix}, \qquad \mathbf{P} \Rightarrow \mathbf{p}^c = \begin{pmatrix} p_x \\ p_y \\ p_z \end{pmatrix}$$
Transformations

- Coordinate transformation between the world frame and the camera frame
- Projection of 3D point coordinates onto 2D image plane coordinates
- Coordinate transformation between possible choices of the image coordinate frame
Transformations

(figure: world frame R, camera frame $R_c$, image frame $R_i$, pixel frame $R_{pix}$, image plane π′)

World frame → camera frame (homogeneous transformation):

$$T_c = \begin{pmatrix} R_c & \mathbf{t}_c \\ \mathbf{0}^T & 1 \end{pmatrix}$$

Camera frame → image frame: perspective projection through the focal length f. Image frame → pixel frame: rescaling and optical correction.
Reference frames

(figure: world frame; camera frame $R_c$ with center $C_c$, axes $i_c$, $j_c$, $k_c$, optical axis, focal plane, focal length f; image frame $R_i$ with axes $i_i$, $j_i$ and origin $O_\pi$; pixel frame $R_{pix}$ with axes u, v; a point P with camera coordinates $\mathbf{p}^c$, image coordinates $\mathbf{p}^i$, and pixel coordinates $\mathbf{p}^{pix}$)

The chain of frames is $R_c \rightarrow R_i \rightarrow R_{pix}$: a translation from $R_c$ to $R_i$, then a translation plus scale from $R_i$ to $R_{pix}$.
Vector notation

In 2D (image frame, metric units and pixel units):

$$\mathbf{p}^i = \begin{pmatrix} x_i \\ y_i \end{pmatrix}, \qquad \mathbf{p}^{pix} = \begin{pmatrix} u \\ v \end{pmatrix} \text{ (in pixel units)}$$

In 3D (camera frame), before and after the optical correction:

$$\mathbf{p}^c = \begin{pmatrix} x_c \\ y_c \\ z_c \end{pmatrix}, \qquad \mathbf{p}'^c = \begin{pmatrix} x'_c \\ y'_c \\ z'_c \end{pmatrix}$$
Perspective projection

(figure: camera center $C_c$, axes $i_c$, $k_c$, focus F, image plane π and reversed image plane π′ at distance f, a point P with camera coordinates $(p_x, p_z)$ and image coordinate $x_i$)

By similar triangles:

$$\frac{x_i}{-f} = \frac{p_x}{p_z} \;\Rightarrow\; x_i = -f\,\frac{p_x}{p_z}$$

Usually the negative sign is avoided by considering the reversed image plane. All points on the same projection ray give the same image P′.
Camera projections

Perspective projection:

$$\frac{x_i}{f} = \frac{p_x}{p_z} \;\rightarrow\; x_i = f\,\frac{p_x}{p_z}$$

Orthographic (scaled) projection, valid when the scene depth variation is small compared to the distance from the camera:

$$\text{if } p_z \approx \text{const}, \qquad x_i = \alpha\, p_x$$

The pixel height of similar subjects is different if the distance from the camera varies a lot. On the left the persons have different pixel heights, while on the right they have approximately similar heights, since their distance from the camera is large and does not vary much.
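The contrast between the two projection models can be sketched numerically; all values below (focal length, subject height, distances) are illustrative:

```python
def perspective(p_x, p_z, f):
    """Perspective projection: x_i = f * p_x / p_z."""
    return f * p_x / p_z

def scaled_orthographic(p_x, alpha):
    """Scaled orthographic projection: x_i = alpha * p_x (p_z ~ const)."""
    return alpha * p_x

f = 0.05                              # 50 mm focal length
# Two subjects of the same height at very different distances:
h_near = perspective(1.8, 2.0, f)     # large image height
h_far = perspective(1.8, 4.0, f)      # half the image height
# The same subjects far from the camera: nearly equal image heights,
# well approximated by the orthographic model with alpha = f / 50.0
h1 = perspective(1.8, 50.0, f)
h2 = perspective(1.8, 52.0, f)
```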
Perspective projection

$$\mathbf{P} \Rightarrow \mathbf{p}^c = \begin{pmatrix} p_x \\ p_y \\ p_z \end{pmatrix}, \qquad \mathbf{p}^i = \frac{f}{p_z}\,\mathbf{p}^c = \begin{pmatrix} f\,p_x/p_z \\ f\,p_y/p_z \\ f \end{pmatrix}$$

equivalently, up to an arbitrary positive constant $\lambda$,

$$\lambda\,\mathbf{p}^i = f\,\mathbf{p}^c, \qquad \lambda = p_z$$
Perspective projection

Homogeneous coordinates:

$$\tilde{\mathbf{p}}^c = \begin{pmatrix} p_x & p_y & p_z & 1 \end{pmatrix}^T, \qquad \tilde{\mathbf{p}}^i = \begin{pmatrix} x_i & y_i & 1 \end{pmatrix}^T$$

$$x_i = f\,\frac{p_x}{p_z}, \quad y_i = f\,\frac{p_y}{p_z} \;\Rightarrow\; \lambda\,\tilde{\mathbf{p}}^i = \Pi\,\tilde{\mathbf{p}}^c, \qquad \Pi = \begin{pmatrix} f & 0 & 0 & 0 \\ 0 & f & 0 & 0 \\ 0 & 0 & 1 & 0 \end{pmatrix}, \qquad \lambda = p_z$$

Π is the homogeneous perspective/projection matrix.
Perspective projection

Composing the projection with the world-to-camera transformation:

$$\lambda\,\tilde{\mathbf{p}}^i = \begin{pmatrix} f & 0 & 0 & 0 \\ 0 & f & 0 & 0 \\ 0 & 0 & 1 & 0 \end{pmatrix} \begin{pmatrix} R_c & \mathbf{t}_c \\ \mathbf{0}^T & 1 \end{pmatrix} \tilde{\mathbf{p}}^w$$

For f = 1 the first factor reduces to the canonical projection matrix $\begin{pmatrix} 1 & 0 & 0 & 0 \\ 0 & 1 & 0 & 0 \\ 0 & 0 & 1 & 0 \end{pmatrix}$; this is the ideal case.
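The homogeneous projection λ p̃ⁱ = Π p̃ᶜ can be sketched with NumPy; the focal length and the test point are illustrative:

```python
import numpy as np

f = 0.05
Pi = np.array([[f, 0.0, 0.0, 0.0],
               [0.0, f, 0.0, 0.0],
               [0.0, 0.0, 1.0, 0.0]])   # homogeneous projection matrix

p_c = np.array([0.2, 0.1, 2.0, 1.0])    # 3D point in homogeneous camera coords
lam_p = Pi @ p_c                         # equals lambda * p_i with lambda = p_z
p_i = lam_p / lam_p[2]                   # normalize: (x_i, y_i, 1)
# x_i = f * p_x / p_z, y_i = f * p_y / p_z
```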
Perspective projection

- Perspective projection (PP) is studied by projective geometry
- PP preserves linearity: lines in 3D correspond to lines in 2D, and vice versa
- PP does not preserve parallelism: the intersection points in 2D of parallel lines in 3D define vanishing points (points infinitely far away)
Camera parameters

- Intrinsic parameters: the parameters that link the pixel coordinates of an image point to the corresponding (metric) coordinates in the camera reference frame
- Extrinsic parameters: the parameters that define the location and orientation of the camera reference frame with respect to a known world reference frame
- Camera calibration: the procedure to estimate these parameters

The extrinsic parameters are the 6 parameters (3 for translation, 3 for rotation) of the world-to-camera transformation:

$$T^W_c = \begin{pmatrix} R^W_c & \mathbf{t}^W_c \\ \mathbf{0}^T & 1 \end{pmatrix}$$
Camera intrinsic parameters

(figure: sensor array of n columns and m rows; a single pixel of width $s_x$ and height $s_y$, both in metres; aspect ratio $s_x/s_y$; image frame $R_i$ with axes $i_i$, $j_i$; pixel frame $R_{pix}$ with axes u, v in pixel units; the camera center $\mathbf{c}$ in pixel units; an example pixel at $(u, v) = (6, 8)$)
Camera intrinsic parameters

- Focal length f
- Transformation between pixel coordinates and camera coordinates
- Geometric distortion introduced by the optical lens system
Camera intrinsic parameters

Transformation between pixel coordinates and camera coordinates (scaling + translation), in homogeneous coordinates:

$$u = \frac{x_i}{s_x} + o_{px}, \qquad v = \frac{y_i}{s_y} + o_{py}$$

and, inverting,

$$x_i = s_x\,(u - o_{px}), \qquad y_i = s_y\,(v - o_{py})$$
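The scaling-plus-translation maps between image and pixel coordinates can be sketched as follows; the pixel size and image center values are illustrative:

```python
def camera_to_pixel(x_i, y_i, s_x, s_y, o_px, o_py):
    """u = x_i / s_x + o_px, v = y_i / s_y + o_py."""
    return x_i / s_x + o_px, y_i / s_y + o_py

def pixel_to_camera(u, v, s_x, s_y, o_px, o_py):
    """Inverse map: x_i = s_x * (u - o_px), y_i = s_y * (v - o_py)."""
    return s_x * (u - o_px), s_y * (v - o_py)

# 5-micron square pixels, principal point at (320, 240):
u, v = camera_to_pixel(5e-4, -2.5e-4, 5e-6, 5e-6, 320.0, 240.0)
x, y = pixel_to_camera(u, v, 5e-6, 5e-6, 320.0, 240.0)   # round trip
```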
Lens distortion

Types: radial distortion (barrel distortion, pincushion distortion) and non-radial (tangential) distortion.

Radial distortion is modelled by a function D(r) that affects each point v in the projected plane relative to the principal point p, where D(r) is normally a non-linear scalar function and p is close to the midpoint of the projected image:

$$\mathbf{v}_d = D(\|\mathbf{v} - \mathbf{p}\|)\,(\mathbf{v} - \mathbf{p}) + \mathbf{p}$$

Barrel distortions are characterized by a positive gradient of the distortion function, pincushion distortions by a negative gradient.
Lens distortion

Radial distortion is approximated by

$$x_i = x_{id}\,(1 + k_1 r^2 + k_2 r^4 + k_3 r^6), \qquad y_i = y_{id}\,(1 + k_1 r^2 + k_2 r^4 + k_3 r^6)$$

where $(x_{id}, y_{id})$ are the coordinates of the distorted image points,

$$r^2 = x_{id}^2 + y_{id}^2$$

is the square of the distance from the camera image center, and $k_1, k_2, k_3$ are intrinsic parameters.
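The polynomial radial model x = x_d(1 + k₁r² + k₂r⁴ + k₃r⁶) is a one-liner; the coefficient values below are illustrative, not calibrated:

```python
def radial_correct(x_d, y_d, k1, k2, k3):
    """Map distorted coordinates (x_d, y_d) through the radial model
    x = x_d * (1 + k1*r^2 + k2*r^4 + k3*r^6), r^2 = x_d^2 + y_d^2."""
    r2 = x_d * x_d + y_d * y_d
    s = 1.0 + k1 * r2 + k2 * r2**2 + k3 * r2**3
    return x_d * s, y_d * s

# With all coefficients zero the model is the identity.
# A positive k1 scales points outward: at r^2 = 1 the scale is 1 + k1.
x, y = radial_correct(1.0, 0.0, 0.1, 0.0, 0.0)
```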
Other image sensor errors

Errors are due to the imperfect orthogonality of pixel elements in CCD or CMOS sensors.
Optical sensors

Optical distance sensors:
- Depth from focus
- Stereo vision
- ToF cameras

Motion and optical flow
Depth from focus

The method consists in measuring the distance of an object by evaluating the focal length adjustment necessary to bring it into focus.

(figures: short-distance focus, medium-distance focus, far-distance focus)
Depth from focus

Thin lens relation between the object distance D and the image distance e:

$$\frac{1}{f} = \frac{1}{D} + \frac{1}{e}$$

(figure: lens of aperture L; object point at (x, y, z) imaged at $(x_i, y_i)$; focal plane and image plane separated by δ; the blur radius b(x) depends on the aperture L, the focal length f, the distances d(x) and e, and the shape term s(x))
Depth from focus

(figures: near focusing vs. far focusing)
Stereo disparity

(figure: left and right lenses separated by the known baseline b; image planes at focal distance f; a scene point (x, y, z) projects to $(x_\ell, y_\ell)$ in the left image and $(x_r, y_r)$ in the right image)
Stereo disparity

Idealized camera geometry for stereo vision:

$$\frac{x_\ell}{f} = \frac{x + b/2}{z}, \qquad \frac{x_r}{f} = \frac{x - b/2}{z}, \qquad \frac{y_\ell}{f} = \frac{y_r}{f} = \frac{y}{z}$$

Solving for the scene point:

$$z = \frac{b f}{x_\ell - x_r}, \qquad x = \frac{b\,(x_\ell + x_r)}{2\,(x_\ell - x_r)}, \qquad y = \frac{b\,(y_\ell + y_r)}{2\,(x_\ell - x_r)}$$

Disparity between the two images → depth computation.
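The idealized stereo triangulation (z = bf / (x_l − x_r)) translates directly into code; the baseline, focal length, and image coordinates below are illustrative:

```python
def stereo_triangulate(xl, yl, xr, f, b):
    """Recover (x, y, z) from left/right image x-coordinates, the focal
    length f and the baseline b (idealized geometry, y_l = y_r)."""
    d = xl - xr                      # disparity
    if d <= 0:
        raise ValueError("disparity must be positive")
    z = b * f / d
    x = b * (xl + xr) / (2.0 * d)
    y = b * yl / d                   # same as b*(yl + yr)/(2*d) when yl == yr
    return x, y, z

# Point at x = 0.3, z = 2.0 seen with b = 0.1 m, f = 0.05 m:
# x_l = f*(x + b/2)/z = 0.00875 and x_r = f*(x - b/2)/z = 0.00625
x, y, z = stereo_triangulate(0.00875, 0.0, 0.00625, 0.05, 0.1)
```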
Stereo vision

- Distance is inversely proportional to disparity: closer objects can be measured more accurately
- Disparity is proportional to baseline: for a given disparity error, the accuracy of the depth estimate increases with increasing baseline b. However, as b is increased, some objects may appear in one camera but not in the other
- A point visible from both cameras produces a conjugate pair
- Conjugate pairs lie on epipolar lines (parallel to the x-axis for the arrangement in the figure above)
Stereo points correspondence

These two points are corresponding: how do you find them in the two images?

(figure: left image, right image, and the resulting right-to-left disparity map)
Epipolar lines

(figure: a point P observed by two cameras with centers $C_1$ and $C_2$, related by the transformation (R, t); image points $q_1$, $q_2$ on the two image planes; epipolar lines $\ell_1$, $\ell_2$)

Corresponding points stay on the epipolar lines. The epipoles $e_1$ and $e_2$ are known and fixed.
Stereo vision

Depth calculation
- The key problem in stereo vision is how to optimally solve the correspondence problem
- Corresponding points lie on the epipolar lines

Gray-level matching
- Match gray-level features on corresponding epipolar lines
- Zero-crossing of the Laplacian of Gaussian is a widely used approach for identifying the same feature in the left and right images
- "Brightness" = image irradiance or intensity I(x, y) is computed and used as shown below
Stereo vision

(figure: left (L) and right (R) vertically filtered images, confidence image, depth image)
Time of Flight (ToF) Cameras – 1

- Range imaging systems based on the known speed of light
- They measure the time-of-flight, i.e., the time from the emission to the return of the signal
- The measurement is performed for each point of the image (different from lidars)
- The distance resolution is about 1 cm (coarser than a lidar's)
- The simplest version of a time-of-flight camera uses light pulses
Time of Flight (ToF) Cameras – 2

- The illumination is switched on for a very short time; the light pulse illuminates the scene and is reflected by the objects
- The camera lens gathers the reflected light and images it onto the sensor plane
- Depending on the distance, the incoming light experiences a delay
- The pulse width of the illumination determines the maximum range the camera can handle
- Only with some special LEDs or lasers is it possible to generate such short pulses
Time of Flight (ToF) Cameras – 3

- Each pixel consists of a photodiode that converts the incoming light into a current
- In analog solutions, fast switches connected to the photodiode send the current to one of two memory elements (capacitors) that act as summation elements
- In digital solutions, a time counter running at several gigahertz is connected to each pixel and stops counting when light is sensed
Time of Flight (ToF) Cameras – 4 (analog solution)

- The pixel uses two switches G1 and G2 and two memory elements S1 and S2
- The switches are controlled by a pulse with the same length as the light pulse, where the control signal of switch G2 is delayed by exactly the pulse width
- Depending on the delay, only part of the light pulse is sampled through G1 into S1; the other part is stored in S2
- Depending on the distance, the ratio between S1 and S2 changes as depicted in the drawing
- Because only small amounts of light hit the sensor within 50 ns, not one but several thousand pulses are sent out (repetition rate tR) and gathered, thus increasing the signal-to-noise ratio
Time of Flight (ToF) Cameras – 5

- After the exposure, the pixel is read out and the signals S1 and S2 are measured. The length t of the light pulse is known, and the distance can be calculated as

$$D = \frac{1}{2}\,c\,t\,\frac{S_2}{S_1 + S_2}$$

- In the presence of background light, the memory elements receive an additional part of the signal that disturbs the distance measurement
- To eliminate the background part of the signal, the whole measurement can be performed a second time with the illumination switched off
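The distance formula D = (1/2) c t S₂/(S₁+S₂) can be sketched as follows; the pulse length and charge values are illustrative:

```python
C = 299_792_458.0   # speed of light in m/s

def tof_distance(s1, s2, t_pulse):
    """D = (1/2) * c * t_pulse * S2 / (S1 + S2)."""
    return 0.5 * C * t_pulse * s2 / (s1 + s2)

# A 50 ns pulse gives a maximum range of c * t / 2, about 7.5 m;
# equal charges S1 = S2 correspond to half of that range.
d = tof_distance(1.0, 1.0, 50e-9)
```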
Time of Flight (ToF) Cameras – 6

(figure)
Time of Flight (ToF) Cameras – 7
Pros

Simplicity
- In contrast to stereo vision or triangulation systems, the whole system is very compact: the illumination is placed just next to the lens, whereas the other systems need a certain minimum baseline. In contrast to laser scanning systems, no mechanical moving parts are needed

Efficient distance algorithm
- It is very easy to extract the distance information from the output signals of the ToF sensor; therefore this task uses only a small amount of processing power, again in contrast to stereo vision, where complex correlation algorithms have to be implemented. After the distance data has been extracted, object detection, for example, is also easy to carry out because the algorithms are not disturbed by patterns on the object

Speed
- Time-of-flight cameras are able to measure the distances within a complete scene with one shot. As the cameras reach up to 160 frames per second, they are ideally suited for real-time applications
Time of Flight (ToF) Cameras – 8

Cons

Background light
- When using CMOS or other integrating detectors or sensors that use visible or near-visible light (400 to 700 nm), although most of the background light coming from artificial lighting or the sun is suppressed, the pixel still has to provide a high dynamic range. The background light also generates electrons, which have to be stored. For example, the illumination units in many of today's ToF cameras can provide an illumination level of about 1 watt. The Sun has an illumination power of about 50 watts per square meter after the optical band-pass filter. Therefore, if the illuminated scene has a size of 1 square meter, the light from the sun is 50 times stronger than the modulated signal. For non-integrating ToF sensors that do not integrate light over time and use near-infrared detectors (InGaAs) to capture the short laser pulse, direct viewing of the sun is a non-issue. Such ToF sensors are used in space applications and are under consideration for automotive applications
Time of Flight (ToF) Cameras – 9

Cons

Interference
- In certain types of ToF devices, if several time-of-flight cameras are running at the same time, the cameras may disturb each other's measurements

Multiple reflections
- In contrast to laser scanning systems, where only a single point is illuminated at once, time-of-flight cameras illuminate a whole scene. On a phase-difference device, due to multiple reflections, the light may reach the objects along several paths; therefore, the measured distance may be greater than the true distance. Direct ToF imagers are vulnerable if the light is reflected from a specular surface. There are published papers that outline the strengths and weaknesses of the various ToF devices and approaches
Optical flow

Optical flow is the pattern of apparent motion of objects, surfaces, and edges in successive scenes caused by the relative motion between the camera and the scene.

Optical flow techniques are used for:
- motion detection
- object segmentation
- time-to-collision estimation
- motion-compensated encoding
- stereo disparity measurement
Optical flow

Optical flow methods try to calculate the motion between two image frames taken at times t and t + δt at every voxel position. These methods are called differential since they are based on local Taylor series approximations of the image signal, i.e., they use partial derivatives with respect to the spatial and temporal coordinates.

A voxel (volumetric + pixel) is a volume element representing a value on a regular grid in 3D space.

Assuming the intensity is conserved between the two frames,

$$I(x, y, t) = I(x + \delta x,\, y + \delta y,\, t + \delta t) = I(x, y, t) + \frac{\partial I}{\partial x}\delta x + \frac{\partial I}{\partial y}\delta y + \frac{\partial I}{\partial t}\delta t + \dots$$

hence

$$\frac{\partial I}{\partial x}\delta x + \frac{\partial I}{\partial y}\delta y + \frac{\partial I}{\partial t}\delta t = 0$$
Optical flow

Dividing by δt and letting $V_x = \delta x/\delta t$, $V_y = \delta y/\delta t$:

$$\frac{\partial I}{\partial x}V_x + \frac{\partial I}{\partial y}V_y + \frac{\partial I}{\partial t} = 0 \quad\Leftrightarrow\quad \nabla I^T \mathbf{V} = -I_t$$

There is only one equation in the two unknowns $V_x$, $V_y$, so it cannot be solved on its own. This is known as the aperture problem of optical flow algorithms. To find the optical flow, another set of equations is needed, given by some additional constraint; all optical flow methods introduce additional conditions for estimating the actual flow.
Optical flow

Lucas–Kanade optical flow method: a two-frame differential method for motion estimation.

The additional constraints needed for the estimation of the flow are introduced by assuming that the flow $(V_x, V_y)$ is constant in a small window of size $m \times m$, with $m > 1$, centered at pixel $(x, y)$. Numbering the pixels $1, \dots, n$ with $n = m^2$, a set of equations is found:

$$\begin{aligned} I_{x1}V_x + I_{y1}V_y &= -I_{t1} \\ I_{x2}V_x + I_{y2}V_y &= -I_{t2} \\ &\;\vdots \\ I_{xn}V_x + I_{yn}V_y &= -I_{tn} \end{aligned} \qquad\Rightarrow\qquad A\mathbf{x} = \mathbf{b}, \quad \mathbf{x} = \begin{pmatrix} V_x \\ V_y \end{pmatrix}$$

The least-squares solution is

$$\mathbf{x} = (A^T A)^{-1} A^T \mathbf{b}$$
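The Lucas–Kanade least-squares step, x = (AᵀA)⁻¹Aᵀb with A = [Ix Iy] and b = −It, can be sketched with NumPy; the gradient values below are synthetic, chosen so that the true flow is (1.0, −0.5):

```python
import numpy as np

def lucas_kanade_flow(Ix, Iy, It):
    """Solve A x = b in the least-squares sense, with A = [Ix Iy] and
    b = -It, returning the flow estimate (Vx, Vy) for the window."""
    A = np.column_stack([Ix, Iy])        # n x 2 matrix of spatial gradients
    b = -np.asarray(It, dtype=float)     # n temporal gradients
    x, *_ = np.linalg.lstsq(A, b, rcond=None)
    return x

# Synthetic window gradients consistent with flow (Vx, Vy) = (1.0, -0.5):
Ix = [1.0, 0.0, 2.0, 1.0]
Iy = [0.0, 1.0, 1.0, 2.0]
It = [-(ix * 1.0 + iy * -0.5) for ix, iy in zip(Ix, Iy)]
V = lucas_kanade_flow(Ix, Iy, It)
```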
Laplacian

- The Laplacian is a 2D isotropic measure of the second spatial derivative of an image
- The Laplacian of an image highlights regions of rapid intensity change and is often used for edge detection
- The Laplacian is often applied to an image that has first been smoothed with something approximating a Gaussian smoothing filter, in order to reduce its sensitivity to noise
- The operator normally takes a single gray-level image as input and produces another gray-level image as output
Laplacian

The Laplacian L(x, y) of an image with pixel intensity values I(x, y) is given by:

$$L(x, y) = \frac{\partial^2 I}{\partial x^2} + \frac{\partial^2 I}{\partial y^2}, \qquad L = P \otimes I$$

where ⊗ is the convolution operator and P is a discrete Laplacian kernel, e.g.

$$P_1 = \begin{pmatrix} 0 & 1 & 0 \\ 1 & -4 & 1 \\ 0 & 1 & 0 \end{pmatrix}, \qquad P_2 = \begin{pmatrix} -1 & -1 & -1 \\ -1 & 8 & -1 \\ -1 & -1 & -1 \end{pmatrix}$$

A Gaussian smoothing kernel is

$$G = \frac{1}{16}\begin{pmatrix} 1 & 2 & 1 \\ 2 & 4 & 2 \\ 1 & 2 & 1 \end{pmatrix}$$
Convolution

- Convolution is a simple mathematical operation which is fundamental to many image processing operators
- Convolution "multiplies together" two arrays of numbers, generally of different sizes but of the same dimensionality, to produce a third array of numbers of the same dimensionality
- This can be used in image processing to implement operators whose output pixel values are simple linear combinations of certain input pixel values
- In image processing, one of the input arrays is normally just the gray-level image. The second array is usually much smaller, also two-dimensional (although it may be just a single pixel thick), and is known as the kernel
Convolution

(figure)
Convolution matrix

(figure: a 6 × 9 image I(i, j) and a 2 × 3 kernel K(i, j))

If the image has M rows and N columns, and the kernel has m rows and n columns, then the output image will have M − m + 1 rows and N − n + 1 columns. For the example above:

$$(6 - 2 + 1) \times (9 - 3 + 1) = 5 \times 7$$
Convolution product

$$O(i, j) = \sum_{k=1}^{m}\sum_{l=1}^{n} I(i + k - 1,\; j + l - 1)\,K(k, l)$$

for $i = 1, \dots, (M - m + 1)$ and $j = 1, \dots, (N - n + 1)$.

(figure: the kernel slides over the image to produce each output value O(i, j))
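The double sum O(i,j) = ΣΣ I(i+k−1, j+l−1) K(k,l) can be implemented directly; a plain-Python sketch, here applied to a 3 × 3 discrete Laplacian kernel as an example:

```python
def convolve_valid(I, K):
    """O(i,j) = sum_k sum_l I(i+k-1, j+l-1) * K(k,l), computed over the
    valid region only, exactly as in the formula (no kernel flipping)."""
    M, N = len(I), len(I[0])
    m, n = len(K), len(K[0])
    return [[sum(I[i + k][j + l] * K[k][l]
                 for k in range(m) for l in range(n))
             for j in range(N - n + 1)]
            for i in range(M - m + 1)]

# A discrete Laplacian kernel responds with zero on a constant image:
P1 = [[0, 1, 0], [1, -4, 1], [0, 1, 0]]
flat = [[7] * 5 for _ in range(4)]       # 4 rows, 5 columns, constant
O = convolve_valid(flat, P1)             # (4-3+1) x (5-3+1) = 2 x 3 output
```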
Stereo vision

Zero-crossing of the Laplacian of Gaussian:
- identification of features that are stable and match well
- Laplacian of the intensity image
- step/edge detection in a noisy image: filter through Gaussian smoothing
Edge detection

(figure)
Natural vision

(figure: the retina)

Natural vision

fMRI shows the brain areas involved in the neural activity associated with vision.

(figure: the optic chiasm)