CS 4495 Computer Vision: 3D Perception
Kelsey Hawkins, Robotics


SLIDE 1

3D Perception CS 4495 Computer Vision – K. Hawkins

Kelsey Hawkins Robotics

CS 4495 Computer Vision 3D Perception

SLIDE 2

Motivation

  • What do animals, people, and robots want to do with vision?
  • Detect and recognize objects/landmarks
    – Is that a banana or a snake? A cup or a plate?
  • Find the location of objects with respect to themselves
    – Want to grasp a fruit/tool: where should I put my body/arm?
    – Changes in elevation: steps, rocks, inclined planes
  • Determine shape
    – What is the physical 3D structure of this object?
    – Where does an object end and the background begin?
  • Find obstacles and map the environment
    – How do I get my body/arm from A to B without hitting things?
  • Others – tracking, dynamics, etc.
SLIDE 3

Weaknesses of Images

  • Color inconsistency
  • Surface geometry

SLIDE 4

Weaknesses of Monocular Vision

  • Scale
  • Lack of texture
  • Background-foreground similarity

SLIDE 5

Potential solution: 3D Sensing

pointclouds.org

SLIDE 6

Types of 3D Sensing

  • Passive 3D sensing
    – Work with naturally occurring light
    – Exploit geometry or known properties of scenes
  • Active 3D sensing
    – Project light or sound out into the environment and see how it reacts
    – Encode some pattern which can be found in the sensor

SLIDE 7

Passive – 3D Sensors

  • Stereo rigs
  • Shape from focus (Nayar, Watanabe, and Noguchi 1996)

SLIDE 8

Active – Photometric Stereo

SLIDE 9

Active – Time of Flight

  • LIDAR / laser / range finder
  • SONAR / sound / transceiver
  • Bounce a signal off of a surface, record the time t for it to come back: x = v·t / 2
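The round-trip relation above can be written out directly; a minimal sketch (the function name and the example numbers are ours, not from the slides):

```python
def time_of_flight_distance(round_trip_time_s, speed_m_per_s):
    """Distance to a surface from a round-trip echo time: x = v * t / 2."""
    return speed_m_per_s * round_trip_time_s / 2.0

# LIDAR: light travels ~3e8 m/s, so an echo after ~66.7 ns means a ~10 m range
lidar_range = time_of_flight_distance(66.7e-9, 3.0e8)

# SONAR: sound in air travels ~343 m/s, so an echo after 0.1 s means ~17 m
sonar_range = time_of_flight_distance(0.1, 343.0)
```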

SLIDE 10

Active - Structured Light

  • Remember stereo?
  • Let's replace one camera with a projector
  • Instead of looking for the same features in both images, we look for a known feature we've projected onto the scene
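Both stereo and structured light recover depth by triangulation. Under the usual rectified pinhole assumptions (not spelled out on the slide), depth follows from the disparity between where a feature is seen and where it was projected: z = f·b/d. A minimal sketch with illustrative numbers:

```python
def depth_from_disparity(disparity_px, focal_px, baseline_m):
    """Rectified triangulation: depth z = f * b / d.

    disparity_px: horizontal offset of the matched feature (pixels)
    focal_px:     focal length (pixels); baseline_m: camera-projector baseline
    """
    return focal_px * baseline_m / disparity_px

# a 10-pixel disparity with f = 500 px and a 10 cm baseline -> 5 m depth
z = depth_from_disparity(10.0, 500.0, 0.1)
```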

SLIDE 11

Active – Structured Light

Zhang, Li, et al., "Rapid shape acquisition..."

SLIDE 12

Active – Infrared Structured Light

SLIDE 13

How the Kinect works

PrimeSense patent 2010/0290698

  • Cylindrical lens

– Only focuses light in one direction

SLIDE 14

How the Kinect works

PrimeSense patent 2010/0290698

SLIDE 15

How the Kinect works

PrimeSense patent 2010/0290698

SLIDE 16

How the Kinect works

PrimeSense patent 2010/0290698

SLIDE 17

How the Kinect works

PrimeSense patent 2010/0290698

Pseudo-random speckle pattern

SLIDE 18

2D vs. 3D Perception

Analysis tool                      2D                      3D
Representation                     Image (u,v)             Depth image (u,v,d); point cloud (x,y,z)
1st-order differential geometry    Image gradients         Surface normals
2nd-order differential geometry    Second moment matrix    Principal curvature
Corner detection                   Harris image            Surface variation
Feature extraction                 HOG                     Point Feature Histograms; spin images
Geometric model fitting            Hough transform         Clustering + RANSAC
Alignment                          SSD window filter       Iterative Closest Point (ICP)

SLIDE 19

Depth Images

  • Advantages
    – Dense representation
    – Gives intuition about occlusion and free space
    – Depth discontinuities are just edges on the image
  • Disadvantages
    – Viewpoint dependent, can't merge multiple views
    – Doesn't capture physical geometry; need actual 3D locations

SLIDE 20

Point Clouds

  • R. Rusu's PCL Presentation
  • Take every depth pixel and put it out in the world
  • What can this representation tell us?
  • What information do we lose?
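Taking "every depth pixel and putting it out in the world" is a pinhole back-projection; a sketch assuming known camera intrinsics fx, fy, cx, cy (the example values below are illustrative, roughly Kinect-class, not from the slides):

```python
import numpy as np

def depth_to_point_cloud(depth, fx, fy, cx, cy):
    """Back-project a depth image (meters) into an organized point cloud.

    Pinhole model: x = (u - cx) * d / fx, y = (v - cy) * d / fy, z = d.
    """
    v, u = np.indices(depth.shape)      # pixel row (v) and column (u) grids
    z = depth
    x = (u - cx) * z / fx
    y = (v - cy) * z / fy
    return np.dstack((x, y, z))         # H x W x 3 array of 3D points

# usage: a flat wall 2 m away back-projects to a plane at z = 2
depth = np.full((480, 640), 2.0)
cloud = depth_to_point_cloud(depth, fx=525.0, fy=525.0, cx=319.5, cy=239.5)
```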
SLIDE 21

Point Clouds

  • Advantages
    – Viewpoint independent
    – Captures surface geometry
    – Points represent physical locations
  • Disadvantages
    – Sparse representation
    – Lost information about free space and unknown space
    – Variable density based on distance from sensor
  • R. Rusu's PCL Presentation
SLIDE 22

Point Clouds and Surfaces

  • Point clouds are sampled from the surfaces of the objects perceived
  • The concept of volume is inferred, not perceived
SLIDE 23

Surfaces

  • Let's say we'd like to learn the "geometry" around a point in our cloud
  • What is the simplest surface representation we could use to approximate the surface about a point?
  • Tangent plane
    – Defined by its normal
    – A first-order approximation
SLIDE 24

Surfaces

  • To understand how we can characterise surfaces, we can look to differential geometry
  • A surface is a 2D manifold in 3D space
  • Parametric representation:

        f : ℝ² → ℝ³,    f(u,v) = (x, y, z)

  • How u and v are "oriented" with respect to the surface is irrelevant

SLIDE 25

Surfaces

[Figure: surface patch parameterized by u and v]

SLIDE 26

Surfaces

[Figure: surface patch parameterized by u and v]

SLIDE 27

Surfaces

[Figure: surface patch parameterized by u and v]

SLIDE 28

Surface Normals

  • Want to estimate the function f(u,v)
  • What can we do to estimate this function?
  • Taylor series 1st-order approximation at (u0, v0):

        f(u,v) ≈ f(u0,v0) + [u−u0, v−v0] [ ∂f/∂u(u0,v0) ; ∂f/∂v(u0,v0) ]

SLIDE 29

Surface Normals

  • We have a problem though... we don't have the basis (u,v), and infinitely many exist!
  • Take a sample of n 3D points we believe lie on f(u,v) around (u0,v0):

        A = [ f(u1,v1) ; ⋮ ; f(un,vn) ] = [ x1 y1 z1 ; ⋮ ; xn yn zn ]
          ≈ [ u1−u0  v1−v0 ; ⋮ ; un−u0  vn−v0 ] [ ∂f/∂u(u0,v0)ᵀ ; ∂f/∂v(u0,v0)ᵀ ]

  • Find the normal n such that A n = 0
  • We've done this before (last eigenvector)
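The "last eigenvector" step can be sketched with an SVD of the centered neighborhood matrix A (the right-singular vector with the smallest singular value is the smallest eigenvector of the covariance AᵀA); the function name is ours:

```python
import numpy as np

def estimate_normal(points):
    """Estimate the surface normal at a point from its n x 3 neighborhood.

    Center the points so the tangent plane passes through the origin, then
    take the right-singular vector with the smallest singular value: the
    direction n for which A n is (nearly) zero.
    """
    A = points - points.mean(axis=0)    # centered neighborhood matrix
    _, _, vt = np.linalg.svd(A)         # rows of vt are right-singular vectors
    return vt[-1]                       # last row <-> smallest singular value

# points sampled from the plane z = 0 -> normal ~ (0, 0, +/-1)
pts = np.array([[0, 0, 0], [1, 0, 0], [0, 1, 0], [1, 1, 0], [0.5, 0.5, 0]], float)
n = estimate_normal(pts)
```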
SLIDE 30

Surface Normals

  • This n (the normal) is perpendicular to both partials, regardless of basis choice:

        ∂f/∂u ⊥ n,   ∂f/∂v ⊥ n

        A n = 0  ⇔  [ ∂f/∂uᵀ ; ∂f/∂vᵀ ] n = 0  ⇔  ∂f/∂u · n = 0,  ∂f/∂v · n = 0

  • The surface normal is a first-order approximation of the surface at the point, invariant to basis choice

SLIDE 31

Surface Normals

  • The size of the patch is like the width of the Gaussian in image gradient calculation
  • We can use normals to find planes

SLIDE 32

Principal Curvature

  • Second order approximation
SLIDE 33

Surface Variation

A=[ f(u1,v1) ⋮ f(un,vn)] =[ x1 y1 z1 ⋮ xn yn zn]

Normal

A=U S V

T=U[

s2 s1 s0] [v2v1v0]

T

Principal Curvatures

surface variation= s0

2

s0

2+s1 2+s2 2

  • This is equivalent to finding the eigenvalues/vectors
  • f the covariance matrix A

T A
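The surface-variation ratio above can be sketched directly from the singular values of the centered neighborhood (a sketch; the helper name is ours):

```python
import numpy as np

def surface_variation(points):
    """Surface variation s0^2 / (s0^2 + s1^2 + s2^2) of an n x 3 neighborhood.

    s0 <= s1 <= s2 are the singular values of the centered matrix A; the
    squared singular values are the eigenvalues of the covariance A^T A.
    A flat patch gives ~0; fully isotropic scatter gives 1/3.
    """
    A = points - points.mean(axis=0)
    s = np.linalg.svd(A, compute_uv=False)   # sorted descending: s2 >= s1 >= s0
    s2sum = np.sum(s ** 2)
    return (s[-1] ** 2) / s2sum if s2sum > 0 else 0.0

# a perfectly flat patch has zero variation
flat = np.array([[0, 0, 0], [1, 0, 0], [0, 1, 0], [1, 1, 0]], float)
```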

SLIDE 34

Normals / Surface Variation Demo

SLIDE 35

Feature Extraction

  • Suppose we want a denser description of the local surface function
  • Want to find unique patches of surface geometry
  • What type of invariance do we need?
  • Need viewpoint invariance
    – Translation + orientation
    – Color and texture come automatically!

SLIDE 36

Point Feature Histograms

  • Remember SIFT?
  • We're going to use roughly the same idea
    – Use the normal at the point to establish a dominant orientation
    – Build a histogram of the orientations of normals in the general region with respect to the original

SLIDE 37

Point Feature Histograms

  • At a point, take a ball of points around it
  • For every pair of points, find the relationship between the two points and their normals
  • Must be frame independent
  • R. Rusu's Thesis
SLIDE 38

Point Feature Histograms

  • Reduce the pair (x1,y1,z1,nx1,ny1,nz1), (x2,y2,z2,nx2,ny2,nz2) to 4 variables
  • R. Rusu's Thesis
SLIDE 39

Point Feature Histograms

  • Find these four variables for every pair in the ball
  • Build a 5×5×5×5 histogram of the variables
    – Often the distance variable is excluded; in that case, we have a 125-long feature vector
  • Use this just like a SIFT feature descriptor
  • Usually a sped-up version, Fast Point Feature Histograms (FPFH), is used for real-time applications
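The four pair variables can be sketched following the (α, φ, θ, d) formulation in Rusu's thesis; this is a simplified sketch (it skips his rule for choosing which point of the pair is the source, and the function name is ours):

```python
import numpy as np

def pfh_pair_features(p1, n1, p2, n2):
    """The four PFH variables for one pair of oriented points.

    Builds a local (Darboux-style) frame (u, v, w) at p1 from its normal and
    the line between the points, then measures the other normal in that
    frame, so the result is independent of the world coordinate frame.
    """
    dvec = p2 - p1
    d = np.linalg.norm(dvec)              # distance feature (often dropped)
    u = n1                                # frame axis 1: the source normal
    v = np.cross(dvec / d, u)             # frame axis 2: perpendicular to both
    v /= np.linalg.norm(v)                # (undefined if n1 is parallel to dvec)
    w = np.cross(u, v)                    # frame axis 3: completes the frame
    alpha = np.dot(v, n2)                 # component of n2 along v
    phi = np.dot(u, dvec / d)             # angle between n1 and the pair line
    theta = np.arctan2(np.dot(w, n2), np.dot(u, n2))
    return alpha, phi, theta, d

# two points on a flat plane with parallel normals -> all angle features are 0
a, phi, th, d = pfh_pair_features(np.array([0., 0, 0]), np.array([0., 0, 1]),
                                  np.array([1., 0, 0]), np.array([0., 0, 1]))
```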

SLIDE 40

Spin Images

  • Rotate a plane about the normal of a point, project all points onto the plane, and build a histogram
  • A. Johnson's 1997 Thesis
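The spinning plane reduces each point to two cylindrical coordinates about the normal axis: elevation β along the normal and radial distance α from it. A sketch of the accumulation (bin count and support radius below are illustrative choices, not values from Johnson's thesis):

```python
import numpy as np

def spin_image(p, n, points, bins=8, radius=1.0):
    """Accumulate a spin image at an oriented point (p, n).

    beta = n . (x - p) is the height along the normal; alpha is the distance
    from the normal axis. Spinning about n discards the azimuth angle, which
    is what makes the descriptor rotation-invariant about the normal.
    """
    rel = points - p
    beta = rel @ n                                       # along the normal
    alpha = np.sqrt(np.maximum(np.sum(rel**2, axis=1) - beta**2, 0.0))
    hist, _, _ = np.histogram2d(alpha, beta, bins=bins,
                                range=[[0, radius], [-radius, radius]])
    return hist                                          # bins x bins image

# five points on a cylinder of radius 0.5 about the normal axis
cyl = np.array([[0.5, 0.0, z] for z in np.linspace(-0.5, 0.5, 5)])
img = spin_image(np.zeros(3), np.array([0.0, 0.0, 1.0]), cyl)
```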
SLIDE 41

Spin Images

  • A. Johnson's 1997 Thesis
SLIDE 42

Comparison of 3D Descriptors

Alexandre, L., 3D Descriptors for Object and Category Recognition: a Comparative Evaluation

SLIDE 43

Comparison of 3D Descriptors

Alexandre, L., 3D Descriptors for Object and Category Recognition: a Comparative Evaluation

SLIDE 44

Alignment

  • PFH correspondences + RANSAC can be good at estimating an initial alignment
  • Often the alignment is off by a little bit
  • Or perhaps we already have a good estimate of the alignment of two point clouds from some other source?
    – Viewpoint is roughly in the same place
    – Use SIFT in 2D
  • How can we remove that last bit of error?
SLIDE 45

Aligning 3D Data

Slides stolen from Ronen Gvili

SLIDE 46

Corresponding Point Set Alignment

Let M be a model point set.
Let S be a scene point set.

We assume:
  1. N_M = N_S
  2. Each point S_i corresponds to M_i


SLIDE 47

Corresponding Point Set Alignment

The MSE objective function:

    f(R, T) = (1/N_S) Σᵢ ‖ mᵢ − Rot(sᵢ) − Trans ‖²  =  (1/N_S) Σᵢ ‖ mᵢ − R(q_R) sᵢ − q_T ‖²

The alignment is:

    (rot, trans, d_mse) = Φ(M, S)


SLIDE 48

Aligning 3D Data

 If correct correspondences are known, we can find the correct relative rotation/translation
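With known correspondences, the least-squares rotation and translation have a closed form (the SVD-based Kabsch/Horn solution); a sketch, with names of our choosing:

```python
import numpy as np

def align_corresponding(M, S):
    """Least-squares R, t with R @ S_i + t ~= M_i for corresponding point sets.

    Center both sets, take the SVD of the 3x3 cross-covariance, and read off
    the rotation; the diagonal correction D guards against reflections.
    """
    cm, cs = M.mean(axis=0), S.mean(axis=0)
    H = (S - cs).T @ (M - cm)                  # cross-covariance of the pairs
    U, _, Vt = np.linalg.svd(H)
    D = np.diag([1.0, 1.0, np.sign(np.linalg.det(Vt.T @ U.T))])
    R = Vt.T @ D @ U.T
    t = cm - R @ cs
    return R, t

# recover a known 90-degree rotation about z plus a translation
S = np.array([[0., 0, 0], [1, 0, 0], [0, 1, 0], [0, 0, 1]])
Rz = np.array([[0., -1, 0], [1, 0, 0], [0, 0, 1]])
M = S @ Rz.T + np.array([1., 2, 3])
R, t = align_corresponding(M, S)
```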


SLIDE 49

Aligning 3D Data

 How to find correspondences: user input? Feature detection? Signatures?

 Alternative: assume closest points correspond


SLIDE 50

Aligning 3D Data

 How to find correspondences: user input? Feature detection? Signatures?

 Alternative: assume closest points correspond


SLIDE 51

Aligning 3D Data

 Converges if starting position is "close enough"


SLIDE 52

The Algorithm

Init the error to ∞
  1. Calculate correspondence: Y = CP(M, S), e
  2. Calculate alignment: (rot, trans, d)
  3. Apply alignment: S′ = rot(S) + trans
  4. Update error: d′ = d
  5. If error > threshold, go to step 1
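The loop above can be sketched as a minimal point-to-point ICP with brute-force closest-point matching and the closed-form SVD alignment in the inner step (a sketch; real implementations use k-d trees and robust rejection):

```python
import numpy as np

def icp(M, S, iters=50, tol=1e-9):
    """Minimal point-to-point ICP aligning scene S to model M.

    Repeat: match each scene point to its closest model point, solve the
    closed-form rotation/translation for those matches, apply it, and stop
    when the mean-squared error stops improving.
    """
    S = S.copy()
    prev_err = np.inf
    for _ in range(iters):
        # correspondence: nearest model point for every scene point (O(n*m))
        d2 = ((S[:, None, :] - M[None, :, :]) ** 2).sum(axis=2)
        Y = M[np.argmin(d2, axis=1)]
        err = d2.min(axis=1).mean()
        if prev_err - err < tol:
            break
        prev_err = err
        # closed-form alignment (Kabsch) for the current matches
        cs, cy = S.mean(axis=0), Y.mean(axis=0)
        U, _, Vt = np.linalg.svd((S - cs).T @ (Y - cy))
        D = np.diag([1.0, 1.0, np.sign(np.linalg.det(Vt.T @ U.T))])
        R = Vt.T @ D @ U.T
        S = S @ R.T + (cy - R @ cs)
    return S, err

# scene = model under a small rigid motion; ICP should snap it back
M = np.array([[0., 0, 0], [1, 0, 0], [0, 1, 0], [0, 0, 1], [1, 1, 0]])
c, s = np.cos(0.1), np.sin(0.1)
Rz = np.array([[c, -s, 0], [s, c, 0], [0, 0, 1]])
S0 = M @ Rz.T + np.array([0.05, -0.02, 0.03])
aligned, err = icp(M, S0)
```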

SLIDE 53

Convergence Theorem

 The ICP algorithm always converges

monotonically to a local minimum with respect to the MSE distance objective function.


SLIDE 54

RANSAC Segmentation

  • RANSAC is a very general algorithm
    – Have some model we want to fit
    – Some reasonable percentage of the dataset fits the model
    – Find the best model by subsampling, fitting, reprojecting, and evaluating the model
  • Plane model:  a x + b y + c z + d = 0
  • A limited cylinder model (axis parallel to z):  (x − a)² + (y − b)² = r²
SLIDE 55

RANSAC Cylinder Segmentation

pointclouds.org

SLIDE 56

Point Cloud Software

  • Point Cloud Library (PCL)
    – http://pointclouds.org
  • Robot Operating System (ROS)
    – Framework for building systems
    – http://www.ros.org
  • Drivers for Kinect and other PrimeSense sensors
    – http://www.ros.org/wiki/openni_launch