SLIDE 1 3D Photography: Features & Correspondences
Kevin Köser
Spring 2013
http://www.cvg.ethz.ch/teaching/2013spring/3dphoto/
SLIDE 2 Schedule (tentative)
- Feb 18 Introduction
- Feb 25 Geometry, Camera Model, Calibration
- Mar 4 Features, Tracking/Matching
- Mar 11 Project Proposals by Students
- Mar 18 Epipolar Geometry
- Mar 25 Stereo Vision
- Apr 1 Easter
- Apr 8 Structure from Motion / SLAM
- Apr 15 Project Updates (Sechseläuten in afternoon)
- Apr 22 Active Ranging, Structured Light
- Apr 29 Volumetric Modeling
- May 6 Mesh-based Modeling
- May 13 Shape-from-X
- May 20 Pentecost / Whit Monday
- May 27 Student Project Demo Day
SLIDE 3 Today: Features & Correspondences
[Figure: 2D-2D and 2D-3D correspondences; labels m_i, m_i+1, M]
Correspondences are at the heart of 3D reconstruction from images
SLIDE 4 Feature matching vs. tracking
- Matching: extract features independently and then match by comparing descriptors
- Tracking: extract features in the first image and then try to find the same feature again in the next view
What is a good feature?
Image-to-image correspondences are key to passive triangulation-based 3D reconstruction
SLIDE 5
Comparing image regions
Compare intensities pixel-by-pixel between patches I(x,y) and I'(x,y)
Dissimilarity measures: Sum of Squared Differences (SSD)
SSD = \sum_{x,y} \bigl(I'(x,y) - I(x,y)\bigr)^2
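A minimal sketch of the SSD dissimilarity for two equally sized grey-value patches, assuming patches are given as nested lists (the names `ssd`, `patch` and `brighter` are illustrative only):

```python
def ssd(patch_a, patch_b):
    """Sum of squared differences between two equally sized patches."""
    if len(patch_a) != len(patch_b) or any(
        len(ra) != len(rb) for ra, rb in zip(patch_a, patch_b)
    ):
        raise ValueError("patches must have the same shape")
    return sum(
        (a - b) ** 2
        for row_a, row_b in zip(patch_a, patch_b)
        for a, b in zip(row_a, row_b)
    )


patch = [[10, 20], [30, 40]]
brighter = [[15, 25], [35, 45]]  # same structure, brightness offset of +5

print(ssd(patch, patch))     # identical patches
print(ssd(patch, brighter))  # a pure brightness offset already produces a large SSD
```

The second call illustrates why plain SSD is sensitive to illumination changes.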
SLIDE 6 Feature points
- Required properties:
- Well-localized
- Stable across views, „repeatable“
(i.e. same 3D point should be extracted as feature for neighboring viewpoints)
SLIDE 7 Feature point extraction
[Figure: homogeneous region, edge, corner]
Find points (local image patches) that differ as much as possible from all neighboring points
SLIDE 8 Feature point extraction
- Approximate SSD for small displacement Δ
- Image difference, square difference for pixel
- SSD for window
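The formulas for these three steps did not survive extraction. A standard reconstruction, assuming an unweighted window W and image gradient \nabla I = (I_x, I_y)^\top (these symbols are assumptions, not taken from the slide), would read:

```latex
\begin{align}
I(\mathbf{x} + \Delta) - I(\mathbf{x}) &\approx \nabla I(\mathbf{x})^\top \Delta \\
\bigl(I(\mathbf{x} + \Delta) - I(\mathbf{x})\bigr)^2 &\approx \Delta^\top \, \nabla I \, \nabla I^\top \, \Delta \\
\mathrm{SSD}(\Delta) &\approx \Delta^\top M \, \Delta, \qquad
M = \sum_{\mathbf{x} \in W}
\begin{pmatrix} I_x^2 & I_x I_y \\ I_x I_y & I_y^2 \end{pmatrix}
\end{align}
```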
SLIDE 9 Feature point extraction
[Figure: homogeneous region, edge, corner]
Find points for which the minimum of Δ^T M Δ over all unit displacements Δ is maximal, i.e. maximize the smallest eigenvalue of M
SLIDE 10 Harris corner detector
- Only use local maxima, subpixel accuracy through second order surface fitting
- Select strongest features over whole image and over each tile (e.g. 1000/image, 2/tile)
- Use small local window:
- Maximize „cornerness“:
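The window and cornerness formulas are missing from the extracted slide. The sketch below assumes an unweighted square window, central-difference gradients, and the commonly used Harris response det(M) - k·trace(M)^2 with k = 0.04; all function names and parameter values are illustrative assumptions.

```python
import math


def structure_tensor(img, cx, cy, r=1):
    """Sum gradient products over a (2r+1)x(2r+1) window centred at (cx, cy)."""
    a = b = c = 0.0
    for y in range(cy - r, cy + r + 1):
        for x in range(cx - r, cx + r + 1):
            ix = (img[y][x + 1] - img[y][x - 1]) / 2.0  # central difference in x
            iy = (img[y + 1][x] - img[y - 1][x]) / 2.0  # central difference in y
            a += ix * ix
            b += ix * iy
            c += iy * iy
    return a, b, c  # M = [[a, b], [b, c]]


def min_eigenvalue(a, b, c):
    """Smallest eigenvalue of the symmetric 2x2 matrix [[a, b], [b, c]]."""
    return 0.5 * (a + c - math.sqrt((a - c) ** 2 + 4.0 * b * b))


def harris_response(a, b, c, k=0.04):
    """Harris cornerness: det(M) - k * trace(M)^2."""
    return (a * c - b * b) - k * (a + c) ** 2


def make_image(inside, size=7):
    """Binary test image: 1.0 where inside(x, y) holds, else 0.0."""
    return [[1.0 if inside(x, y) else 0.0 for x in range(size)] for y in range(size)]


flat = make_image(lambda x, y: False)
edge = make_image(lambda x, y: x >= 3)
corner = make_image(lambda x, y: x >= 3 and y >= 3)

for name, img in (("flat", flat), ("edge", edge), ("corner", corner)):
    m = structure_tensor(img, 3, 3)
    print(name, min_eigenvalue(*m), harris_response(*m))
```

Only the corner patch yields a clearly positive smallest eigenvalue and a positive Harris response; the edge has one vanishing eigenvalue.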
SLIDE 11 Simple matching
- for each corner in image 1 find the corner in
image 2 that is most similar and vice-versa
- Only compare geometrically compatible points
- Keep mutual best matches
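The mutual-best-match rule can be sketched as follows, assuming a precomputed similarity matrix `sim` (higher is better). The geometric compatibility check is omitted here, and all names are illustrative.

```python
def mutual_best_matches(sim):
    """sim[i][j]: similarity of corner i (image 1) and corner j (image 2).

    Returns the (i, j) pairs that are each other's best match.
    """
    n1, n2 = len(sim), len(sim[0])
    best_12 = [max(range(n2), key=lambda j: sim[i][j]) for i in range(n1)]
    best_21 = [max(range(n1), key=lambda i: sim[i][j]) for j in range(n2)]
    return [(i, j) for i, j in enumerate(best_12) if best_21[j] == i]


sim = [
    [0.90, 0.20, 0.10],
    [0.30, 0.80, 0.70],
    [0.85, 0.10, 0.40],
]
print(mutual_best_matches(sim))
```

Corner 2 of image 1 prefers corner 0 of image 2, but that corner prefers corner 0 of image 1, so the pair is dropped.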
What transformations does this work for?
SLIDE 13
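Comparing image regions
Compare intensities pixel-by-pixel between patches I(x,y) and I'(x,y)
Similarity measures: Zero-mean Normalized Cross Correlation (ZNCC)
ZNCC = \frac{\sum_{x,y} (I(x,y) - \bar{I}) (I'(x,y) - \bar{I'})}{\sqrt{\sum_{x,y} (I(x,y) - \bar{I})^2 \, \sum_{x,y} (I'(x,y) - \bar{I'})^2}}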
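A minimal ZNCC sketch for two equally sized patches given as nested lists. Returning 0.0 for a completely flat patch is a convention chosen here, not something the slide specifies.

```python
import math


def zncc(patch_a, patch_b):
    """Zero-mean normalized cross correlation in [-1, 1]."""
    a = [v for row in patch_a for v in row]
    b = [v for row in patch_b for v in row]
    if len(a) != len(b):
        raise ValueError("patches must have the same number of pixels")
    mean_a = sum(a) / len(a)
    mean_b = sum(b) / len(b)
    da = [v - mean_a for v in a]
    db = [v - mean_b for v in b]
    denom = math.sqrt(sum(x * x for x in da) * sum(y * y for y in db))
    if denom == 0.0:
        return 0.0  # flat patch: correlation is undefined, 0.0 by convention here
    return sum(x * y for x, y in zip(da, db)) / denom


patch = [[10, 20], [30, 40]]
gain_offset = [[25, 45], [65, 85]]  # 2 * patch + 5
inverted = [[40, 30], [20, 10]]

print(zncc(patch, gain_offset))  # invariant to gain and offset
print(zncc(patch, inverted))     # contrast reversal gives the opposite sign
```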
SLIDE 14 Feature matching: example
[Figure: similarity scores (between 0.08 and 0.99) for candidate matches among corners 1-5 in each of the two images]
What transformations does this work for? What level of transformation do we need?
SLIDE 15 Wide baseline matching
- Requirement to cope with larger variations between images
- Geometric transformations: translation, rotation, scaling, foreshortening
- Photometric changes: non-diffuse reflections, illumination
SLIDE 16 Invariant detectors
Scale invariant Affine invariant
(approximately invariant w.r.t. perspective/viewpoint)
Rotation invariant
SLIDE 17 2D Transformations of a Local Patch
Block Matching
e.g. MSER
In practice hardly observable in small patches !
SLIDE 18
Example: Find Correspondences between these images using the MSER Detector [Matas'02]
SLIDE 19
MSER Features
Local regions, not points !
SLIDE 20 Extremal Regions:
Surrounding
SLIDE 21 Extremal Regions:
Surrounding
SLIDE 22 Regions: Connected Pixels at some threshold
- Region Size = # Pixels
- Maximally stable: Size Constant near some
threshold
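To illustrate "size constant near some threshold", the sketch below grows the dark region around a single seed pixel for several thresholds. It assumes 4-connectivity and a hand-made test image; actual MSER implementations process all regions at once with a component tree, which is not shown.

```python
def region_size(img, seed, threshold):
    """Count 4-connected pixels with value <= threshold reachable from seed (row, col)."""
    h, w = len(img), len(img[0])
    if img[seed[0]][seed[1]] > threshold:
        return 0
    seen = {seed}
    stack = [seed]
    while stack:
        y, x = stack.pop()
        for ny, nx in ((y - 1, x), (y + 1, x), (y, x - 1), (y, x + 1)):
            if (0 <= ny < h and 0 <= nx < w
                    and (ny, nx) not in seen and img[ny][nx] <= threshold):
                seen.add((ny, nx))
                stack.append((ny, nx))
    return len(seen)


img = [
    [200, 200, 200, 200, 200],
    [200,  10,  12, 200, 200],
    [200,  11, 120, 200, 200],
    [200, 200, 200, 200, 200],
    [200, 200, 200, 200, 200],
]
sizes = {t: region_size(img, (1, 1), t) for t in (5, 50, 100, 150, 220)}
print(sizes)
```

The region keeps the same size over the threshold range 50 to 100, which is the kind of plateau that marks a maximally stable region.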
SLIDE 23
A Sample Feature
SLIDE 24
„T“ is maximally stable with respect to surrounding
SLIDE 25
- Compute „center of gravity“
- Compute Scatter (PCA / Ellipsoid)
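These two steps can be sketched as the mean and covariance of the region's pixel coordinates, followed by a closed-form eigen-decomposition of the 2x2 covariance. Using one standard deviation as the semi-axis length is a scaling convention assumed here.

```python
import math


def region_ellipse(pixels):
    """pixels: list of (x, y). Returns (centre, (major, minor, angle_rad))."""
    n = len(pixels)
    cx = sum(x for x, _ in pixels) / n
    cy = sum(y for _, y in pixels) / n
    sxx = sum((x - cx) ** 2 for x, _ in pixels) / n
    syy = sum((y - cy) ** 2 for _, y in pixels) / n
    sxy = sum((x - cx) * (y - cy) for x, y in pixels) / n
    # Eigenvalues of the symmetric covariance [[sxx, sxy], [sxy, syy]]
    half_trace = 0.5 * (sxx + syy)
    root = math.sqrt((0.5 * (sxx - syy)) ** 2 + sxy ** 2)
    lam_major = half_trace + root
    lam_minor = max(half_trace - root, 0.0)
    angle = 0.5 * math.atan2(2.0 * sxy, sxx - syy)  # direction of the major axis
    return (cx, cy), (math.sqrt(lam_major), math.sqrt(lam_minor), angle)


bar = [(x, 0) for x in range(5)]  # horizontal bar, 5 pixels long
centre, (major, minor, angle) = region_ellipse(bar)
print(centre, major, minor, angle)
```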
SLIDE 26
Different Images: Different positions/sizes/shapes
Ellipse abstracts from pixels ! Geometric representation: position/size/shape
SLIDE 27
Still: How to compare ?
Idea: Normalize to „Default“ Position/Size/Shape ! e.g. Circle of Radius 16 Pixels !
SLIDE 28
Idea: Normalize to „Default“ Position/Size/Shape ! Ok, but 2D orientation ?
SLIDE 29
- Idea (Lowe'99): Run over all pixels and chart the local gradient orientation in a histogram
- Find Dominant Orientation in Histogram
- Rotate Local Patch into Dominant Orientation
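A minimal sketch of the orientation histogram, assuming central-difference gradients, magnitude weighting and 36 bins of 10° (the bin count is an assumption). Histogram smoothing and the actual rotation of the patch are omitted.

```python
import math


def dominant_orientation(patch, bins=36):
    """Return the centre (in degrees) of the strongest gradient-orientation bin."""
    hist = [0.0] * bins
    h, w = len(patch), len(patch[0])
    for y in range(1, h - 1):
        for x in range(1, w - 1):
            gx = (patch[y][x + 1] - patch[y][x - 1]) / 2.0
            gy = (patch[y + 1][x] - patch[y - 1][x]) / 2.0
            mag = math.hypot(gx, gy)
            if mag == 0.0:
                continue
            angle = math.degrees(math.atan2(gy, gx)) % 360.0
            hist[int(angle / (360.0 / bins)) % bins] += mag  # magnitude as weight
    peak = max(range(bins), key=lambda i: hist[i])
    return (peak + 0.5) * 360.0 / bins


ramp_diag = [[x + y for x in range(6)] for y in range(6)]  # gradient along +x and +y
ramp_anti = [[x - y for x in range(6)] for y in range(6)]  # gradient along +x and -y

print(dominant_orientation(ramp_diag))
print(dominant_orientation(ramp_anti))
```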
SLIDE 30
Each normalized patch obtained from single image !
SLIDE 31 Wrap-up MSER
- detect sets of pixels brighter/darker than surroundings
- fit elliptical shape to pixel set
- warp image so that ellipse becomes circle
- rotate to dominant gradient direction [other constructions possible as well]
Affine normalization of feature leads to similar patches in different views !
Two MSER regions: Are they in correspondence ?
SLIDE 32 Traditional Matching Approach: Compare Regions (Sum of Squared Differences)
- Small Misalignment
- Brightness Change
More Tolerant Comparison ?
SLIDE 33 SIFT Descriptor [Lowe'99]:
- Brightness offset: use only gradients !
- Partition into sectors
- For each sector: store orientations of gradients !
[Figure: gradient magnitude; gradient orientation/magnitude]
SLIDE 34 Quantize Gradient Orientation, e.g. 45° Steps
- Orientation histogram per sector (gradient magnitude as weight)
- m sectors with n orientations: (m·n) values
- Construct vector: 35 12 10 25 … 29
[Figure: gradient orientation/magnitude]
SLIDE 35
Summed gradient magnitudes for different sectors and orientations: 35 12 10 25 … 29
Normalize (suppresses changing contrast effects): 0.12 0.04 0.03 0.08 … 0.10
„SIFT Descriptor“: 128 bytes
Memory comparison: 11x11 patch of raw grey values = 121 bytes
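A small sketch of why normalization suppresses contrast changes. It assumes normalization to unit Euclidean length, which is common for SIFT-style descriptors; the slide does not say which norm produced its example values, so the numbers printed here are not expected to reproduce them.

```python
import math


def normalize_descriptor(vec):
    """Scale a histogram vector to unit Euclidean length (assumed norm)."""
    norm = math.sqrt(sum(v * v for v in vec))
    if norm == 0.0:
        return [0.0 for _ in vec]
    return [v / norm for v in vec]


raw = [35.0, 12.0, 10.0, 25.0, 29.0]
double_contrast = [2.0 * v for v in raw]  # doubled contrast doubles all gradient magnitudes

print(normalize_descriptor(raw))
print(normalize_descriptor(double_contrast))
```

Both calls give the same vector, because a global contrast factor scales every gradient magnitude equally.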
SLIDE 36
Wrap Up: Normalized Patch Comparison vs. Descriptor
- Usage of gradients: intensity offset compensation
- Subdivision into sectors / per-sector histogram: small alignment error compensation
- Normalization of histogram vector: image contrast compensation
But most important: Avoid sudden „descriptor jumps“
SLIDE 37
Classical Histogram (Quantization 45°):
- 22° quantized/rounded to 0°
- 23° quantized/rounded to 45°
Small differences can lead to different bins !
Feature position, size, shape, orientation uncertain, image content noisy !
Descriptor MUST tolerate this (no sudden changes !)
Solution: „Soft-Binning“ !
SLIDE 38 Histogram (Quantization 45°): Classical (closest bin)
Bins: 0° 45° 90° …
- 20° adds 1.0 to the 0° bin (total 1.0)
- 22° adds 1.0 to the 0° bin (total 2.0)
If orientation is 3° different, all measurements go to second bin ! => Sudden change in histogram from (2 0 0 0) to (0 2 0 0)
SLIDE 39 Histogram (Quantization 45°): Soft-Binning
Bins: 0° 45° 90° …
- 20° adds 0.56 to the 0° bin and 0.44 to the 45° bin
- 22° raises the 0° bin to 1.07 and the 45° bin to 0.93
If orientation is 3° different, descriptor changes only gradually !
Soft Weights: „Bin Correctness“
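The two binning schemes can be compared with a short sketch. It assumes 45° bins and linear interpolation between the two neighbouring bin centres for soft-binning; function names are illustrative.

```python
def hard_bin(angles_deg, step=45.0):
    """Classical histogram: each measurement goes to the closest bin centre."""
    n = int(round(360.0 / step))
    hist = [0.0] * n
    for a in angles_deg:
        hist[int(round((a % 360.0) / step)) % n] += 1.0
    return hist


def soft_bin(angles_deg, step=45.0):
    """Soft-binning: split each measurement linearly between the two nearest bins."""
    n = int(round(360.0 / step))
    hist = [0.0] * n
    for a in angles_deg:
        pos = (a % 360.0) / step
        lower = int(pos) % n
        frac = pos - int(pos)
        hist[lower] += 1.0 - frac
        hist[(lower + 1) % n] += frac
    return hist


print(hard_bin([20, 22])[:2], hard_bin([23, 25])[:2])  # sudden jump after a 3° change
print(soft_bin([20, 22])[:2], soft_bin([23, 25])[:2])  # gradual change
```

With hard binning the histogram flips from (2 0 …) to (0 2 …); with soft-binning the first two bins only move from about (1.07, 0.93) to about (0.93, 1.07).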
SLIDE 40 Wrap-up
Detector:
- Find interesting regions (position/size/shape)
- Assign dominant gradient orientation
- Normalize regions
Descriptor:
- Compute „signature“ upon normalized region
- Behave smoothly in presence of distortions:
brightness changes / normalization inaccuracies
SLIDE 41
How to find correspondences ? For each Region: 128-dim. Descriptor
SLIDE 42 Matching Scenario I
Two images in a dense photo sequence:
- think about maximum movement d (e.g. 50 pixels)
- Search in a window +/- d of old position
- Compare descriptors, choose most similar
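The search-window strategy can be sketched as below, assuming each candidate is a `(position, descriptor)` tuple and descriptors are compared by squared Euclidean distance (both are assumptions for illustration).

```python
def sq_dist(a, b):
    """Squared Euclidean distance between two descriptor vectors."""
    return sum((p - q) ** 2 for p, q in zip(a, b))


def match_in_window(pos, desc, candidates, d=50):
    """Index of the most similar candidate within +/- d pixels of pos, or None."""
    best_index, best_cost = None, float("inf")
    for idx, (cpos, cdesc) in enumerate(candidates):
        if abs(cpos[0] - pos[0]) > d or abs(cpos[1] - pos[1]) > d:
            continue  # outside the search window
        cost = sq_dist(desc, cdesc)
        if cost < best_cost:
            best_index, best_cost = idx, cost
    return best_index


candidates = [
    ((500, 500), [1.0, 0.0]),  # identical descriptor, but far away
    ((110, 95), [0.9, 0.1]),
    ((120, 100), [0.0, 1.0]),
]
print(match_in_window((100, 100), [1.0, 0.0], candidates))
```

The far-away look-alike is never considered, which is the main benefit of the spatial window.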
SLIDE 43 Matching Scenario II
Two arbitrary images / Wide baseline
- Compare every descriptor with every other (e.g. GPU)
- OR: Find small set of matches, predict others
- OR: Find nearest neighbor in descriptor space
SLIDE 44 Searching Descriptor Space
Key Ideas:
- Each descriptor consists of 128 numbers (imagine a vector from ℝ^128)
- Corresponding descriptors: not far apart !
- Arrange all descriptors of image 1 in kd-tree (imagine „octree“ but with more dims)
- For each descriptor of image 2: find (approximate) nearest neighbor in tree
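A compact kd-tree sketch in plain Python. It performs an exact nearest-neighbour search on low-dimensional toy data; for 128-dimensional descriptors an approximate search with a bounded number of visited nodes is typically used instead, which is not shown. The data layout `(vector, payload)` is an assumption.

```python
def build_kdtree(points, depth=0):
    """points: list of (vector, payload). Returns (point, axis, left, right) or None."""
    if not points:
        return None
    axis = depth % len(points[0][0])  # cycle through the dimensions
    points = sorted(points, key=lambda p: p[0][axis])
    mid = len(points) // 2
    return (points[mid], axis,
            build_kdtree(points[:mid], depth + 1),
            build_kdtree(points[mid + 1:], depth + 1))


def nearest(node, query, best=None):
    """Exact nearest neighbour; returns (squared_distance, point)."""
    if node is None:
        return best
    point, axis, left, right = node
    dist = sum((a - b) ** 2 for a, b in zip(point[0], query))
    if best is None or dist < best[0]:
        best = (dist, point)
    diff = query[axis] - point[0][axis]
    near, far = (left, right) if diff < 0 else (right, left)
    best = nearest(near, query, best)
    if diff * diff < best[0]:  # the other half-space may still hold a closer point
        best = nearest(far, query, best)
    return best


image1 = [([0.1, 0.9], "a"), ([0.8, 0.2], "b"), ([0.5, 0.5], "c"), ([0.9, 0.9], "d")]
tree = build_kdtree(image1)
print(nearest(tree, [0.75, 0.25])[1][1])
```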
SLIDE 45
Searching Descriptor Space
- „Learn“ important dimensions of 128D space for a given scene, e.g. PCA or LDA
- Project descriptors to important dimensions, use kd-tree
SLIDE 46 Matching Techniques
Spatial Search Window:
- Requires/exploits good prediction
- Can avoid far away similar-looking features
- Good for sequences
Descriptor Space:
- Initial tree setup
- Fast lookup for huge amounts of features
- More sophisticated outlier detection required
- Good for asymmetric (offline/online) problems, registration, initialization, object recognition, wide baseline matching
SLIDE 47 Correspondence Verification
Features have only very local view => Mismatches
How to detect ?
- Discard matches with low similarity
- Delete „non-distinctive“ features (those with close match in same image or with similar 2nd best match)
- Check for bi-directional consistency
- Geometric verification, e.g. RANSAC
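The "similar 2nd best match" criterion is often implemented as a distance-ratio test. The sketch below assumes Euclidean descriptor distance and a ratio threshold of 0.8; that value is a commonly used choice and not taken from the slide.

```python
def sq_dist(a, b):
    """Squared Euclidean distance between two descriptor vectors."""
    return sum((p - q) ** 2 for p, q in zip(a, b))


def distinctive_matches(desc1, desc2, max_ratio=0.8):
    """Keep i -> j only if the best match is clearly better than the second best."""
    matches = []
    for i, d in enumerate(desc1):
        dists = sorted((sq_dist(d, e), j) for j, e in enumerate(desc2))
        if len(dists) < 2:
            continue
        (best, j), (second, _) = dists[0], dists[1]
        # Distances are squared, so the ratio threshold is squared as well.
        if best < (max_ratio ** 2) * second:
            matches.append((i, j))
    return matches


desc1 = [[0.0, 0.0], [5.0, 5.0]]
desc2 = [[0.1, 0.0], [10.0, 10.0], [4.0, 5.0], [5.0, 4.0]]
print(distinctive_matches(desc1, desc2))
```

The second descriptor has two equally good candidates and is therefore discarded as non-distinctive.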
SLIDE 48
“conjugate rotation property” equivalent to l ) determines “projective” scale factor when going to
Object Detection / Pose Estimation using single MSER
Affine feature: position (2x), shape (3x), orientation (1x): 6 degrees of freedom, more than a simple 2D point !
SLIDE 49 2D Transformations of a Local Patch
Block Matching
e.g. MSER e.g. SIFT
SLIDE 50 Lowe's SIFT features
Detector + Descriptor
Recover features with position, orientation and scale
(Lowe, ICCV99)
SLIDE 51 Position
- Look for strong responses of DOG filter
(Difference-Of-Gaussian)
- Only consider local maxima
SLIDE 52 Scale
- Look for strong responses of DOG filter
(Difference-Of-Gaussian) over scale space
- Only consider local maxima in both
position and scale
- Fit quadratic around maxima for subpixel accuracy
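In one dimension, the quadratic fit reduces to a parabola through the peak sample and its two neighbours. The sketch below shows only this 1D case; the fit over position and scale jointly follows the same idea and is not shown.

```python
def subpixel_offset(f_prev, f_peak, f_next):
    """Offset of the parabola extremum relative to the peak sample (in samples)."""
    denom = f_prev - 2.0 * f_peak + f_next
    if denom == 0.0:
        return 0.0  # the three samples lie on a line, no refinement possible
    return 0.5 * (f_prev - f_next) / denom


# Samples of f(x) = -(x - 0.3)^2 at x = -1, 0, 1; the true maximum is at x = 0.3
samples = [-(x - 0.3) ** 2 for x in (-1.0, 0.0, 1.0)]
print(subpixel_offset(*samples))
```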
SLIDE 53
Minimum contrast and “cornerness”
SLIDE 54 Orientation
- Create histogram of local gradient directions computed at selected scale
- Assign canonical orientation at peak of smoothed histogram
- Each key specifies stable 2D coordinates (x, y, scale, orientation)
SLIDE 55 SIFT descriptor
- Thresholded image gradients are sampled over 16x16 array of locations in scale space
- Create array of orientation histograms
- 8 orientations x 4x4 histogram array = 128 dimensions
SLIDE 56
SLIDE 57 Some Feature Resources
- Affine feature evaluation + binaries:
http://www.robots.ox.ac.uk/~vgg/research/affine/
- SIFT + MSER + some tools:
http://vlfeat.org
- SURF:
http://www.vision.ee.ethz.ch/~surf/
- SiftGPU:
http://www.cs.unc.edu/~ccwu/siftgpu/
- DAISY (dense descriptors):
http://cvlab.epfl.ch/~tola/daisy.html
- FAST:
http://svr-www.eng.cam.ac.uk/~er258/work/fast.html
Check also OpenCV + try to google
SLIDE 58 Recent Variants and Accelerations
(much faster, but this usually comes at a price)
- BRIEF[Calonder10]: binary descriptor (tests=position a darker than b), compare descriptors by XOR (Hamming) + POPCNT
- RIFF[Takacs10]: CENSURE + gradients tangential/radial
- ORB[Rublee11]: FAST+orientation
- BRISK[Leutenegger11]: FAST+scale+BRIEF
- FREAK[Alahi12]: FAST + “daisy”-BRIEF
- Lucid[Ziegler12]: “sort intensities”
- D-BRIEF[Trzcinski12]: Box-Filter+learned projection+BRIEF
- LDA HASH[Strecha12]: binary tests on descriptor
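The binary-test idea behind BRIEF-style descriptors can be sketched as follows. The test-pair layout and the tiny patch are made up for illustration; published descriptors use fixed, much longer test patterns on smoothed patches.

```python
def binary_descriptor(patch, pairs):
    """Bit i is set if the pixel at pairs[i][0] is darker than the one at pairs[i][1]."""
    bits = 0
    for i, ((y1, x1), (y2, x2)) in enumerate(pairs):
        if patch[y1][x1] < patch[y2][x2]:
            bits |= 1 << i
    return bits


def hamming(a, b):
    """Number of differing bits: XOR followed by a population count."""
    return bin(a ^ b).count("1")


pairs = [((0, 0), (0, 1)), ((1, 0), (1, 1)), ((0, 1), (1, 1))]  # hypothetical test pattern
patch = [[1, 5], [9, 3]]
brighter = [[11, 15], [19, 13]]  # same patch with an offset of +10
flipped = [[5, 1], [3, 9]]       # columns swapped

print(binary_descriptor(patch, pairs))
print(hamming(binary_descriptor(patch, pairs), binary_descriptor(brighter, pairs)))
print(hamming(binary_descriptor(patch, pairs), binary_descriptor(flipped, pairs)))
```

Because only intensity orderings are stored, a brightness offset leaves the descriptor unchanged.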
SLIDE 59 Features: local normalization + robust descriptors
Allow
- changing perspective/illumination/scale
- scenarios with many occlusions
- different reasoning (descriptor vectors)
Require
- (near) planarity across feature in 3D
- „detectable“ regions, enough structure
But
- fewer features / slower
- invariance vs. descriptiveness: descriptors can be similar when regions are not !
=> What level of invariance / speed / properties do YOU really need ?
SLIDE 60 Feature tracking
- Identify features and track them over video
- Small difference between frames
- Potentially large difference overall
- Standard approach: KLT (Kanade-Lucas-Tomasi)
SLIDE 61
Tracking corners through video
SLIDE 62 Good features to track
- Use same window in feature selection as for
tracking itself
- Compute motion assuming it is small
Affine is also possible, but a bit harder (6x6 instead of 2x2)
differentiate:
SLIDE 63 Example
Simple displacement is sufficient between consecutive frames, but not to compare to reference template
SLIDE 64
Example
SLIDE 65 Good features to keep tracking
Perform affine alignment between first and last frame
Stop tracking features with too large errors
SLIDE 66 Intensity Linearization
- Brightness constancy assumption
- Linearization (small motion)
- Possibility for iterative refinement
SLIDE 67 Intensity Linearization
- Brightness constancy assumption
- Linearization (small motion): 2 unknowns, 1 constraint => the “aperture” problem
[Figure: isophotes I(t)=I and I(t+1)=I]
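The equations on this slide were lost in extraction. A standard form of the brightness constancy assumption and its first-order linearization, with displacement (u, v) and partial derivatives I_x, I_y, I_t as assumed notation, is:

```latex
\begin{align}
I(x + u,\, y + v,\, t + 1) &= I(x, y, t) \\
I(x + u,\, y + v,\, t + 1) &\approx I(x, y, t + 1) + I_x u + I_y v \\
\Rightarrow \quad I_x u + I_y v + I_t &= 0, \qquad I_t = I(x, y, t + 1) - I(x, y, t)
\end{align}
```

The last line is a single linear constraint on the two unknowns u and v: only the displacement component along the image gradient is determined.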
SLIDE 68 Intensity Linearization
- How to deal with aperture problem?
Assume neighbors have same displacement
(3 constraints if color gradients are different)
SLIDE 69 Lucas-Kanade
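Assume neighbors have same displacement
Least-squares over the window W: \left(\sum_{W} \nabla I \, \nabla I^\top\right) \begin{pmatrix} u \\ v \end{pmatrix} = -\sum_{W} \nabla I \, I_t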
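A single Lucas-Kanade step can be sketched as the solution of a 2x2 linear system built from gradient sums over a window. The sketch assumes central-difference gradients taken in the first image, an unweighted window, and synthetic test images; iterative refinement and image pyramids are left out.

```python
def lucas_kanade_step(img0, img1, cx, cy, r=2):
    """Estimate (u, v) for the window centred at (cx, cy); None if M is singular."""
    a = b = c = p = q = 0.0
    for y in range(cy - r, cy + r + 1):
        for x in range(cx - r, cx + r + 1):
            ix = (img0[y][x + 1] - img0[y][x - 1]) / 2.0
            iy = (img0[y + 1][x] - img0[y - 1][x]) / 2.0
            it = img1[y][x] - img0[y][x]
            a += ix * ix
            b += ix * iy
            c += iy * iy
            p += ix * it
            q += iy * it
    det = a * c - b * b
    if abs(det) < 1e-12:
        return None  # aperture problem: gradients in the window are all parallel
    # Solve [[a, b], [b, c]] (u, v)^T = -(p, q)^T
    return (-c * p + b * q) / det, (b * p - a * q) / det


U, V = 0.2, 0.1  # true sub-pixel displacement of the synthetic image content
img0 = [[0.5 * x * x + 0.5 * y * y for x in range(11)] for y in range(11)]
img1 = [[0.5 * (x - U) ** 2 + 0.5 * (y - V) ** 2 for x in range(11)] for y in range(11)]
stripes0 = [[0.5 * x * x for x in range(11)] for y in range(11)]
stripes1 = [[0.5 * (x - U) ** 2 for x in range(11)] for y in range(11)]

print(lucas_kanade_step(img0, img1, 5, 5))
print(lucas_kanade_step(stripes0, stripes1, 5, 5))
```

The first call recovers the displacement up to a small linearization error; the second returns None because the image varies in x only.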
SLIDE 70 Revisiting the small motion assumption
- Is this motion small enough?
- Probably not: it's much larger than one pixel (1st order Taylor not sufficient)
- How might we solve this problem?
* From Khurram Hassan-Shafique CAP5415 Computer Vision 2003
SLIDE 71 Reduce the resolution!
* From Khurram Hassan-Shafique CAP5415 Computer Vision 2003
SLIDE 72 Coarse-to-fine optical flow estimation
[Figure: Gaussian pyramids of image I_{t-1} and image I; displacement u = 10 pixels at full resolution, then 5, 2.5 and 1.25 pixels at successively coarser levels]
slides from Bradski and Thrun
SLIDE 73 Coarse-to-fine optical flow estimation
[Figure: Gaussian pyramids of image I_{t-1} and image I; run iterative L-K at the coarsest level, warp & upsample, run iterative L-K at the next finer level, ...]
slides from Bradski and Thrun
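The effect of the pyramid on displacement size can be sketched as below. A 2x2 box average stands in for Gaussian smoothing plus subsampling, which is a simplification; the per-level L-K refinement and warping are not shown.

```python
def downsample(img):
    """Halve the resolution by averaging 2x2 blocks (box filter, a simplification)."""
    return [
        [(img[y][x] + img[y][x + 1] + img[y + 1][x] + img[y + 1][x + 1]) / 4.0
         for x in range(0, len(img[0]) - 1, 2)]
        for y in range(0, len(img) - 1, 2)
    ]


def build_pyramid(img, levels):
    """Level 0 is the full-resolution image; each further level is half the size."""
    pyramid = [img]
    for _ in range(levels - 1):
        pyramid.append(downsample(pyramid[-1]))
    return pyramid


def displacement_at_level(u, level):
    """A displacement of u pixels at level 0 shrinks by a factor of 2 per level."""
    return u / (2 ** level)


image = [[float(x + y) for x in range(8)] for y in range(8)]
pyr = build_pyramid(image, 4)
print([(len(p), len(p[0])) for p in pyr])
print([displacement_at_level(10.0, lvl) for lvl in range(4)])
```

At the coarsest level the 10-pixel motion is close to one pixel, so the small-motion linearization becomes reasonable again.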
SLIDE 74
Next week: Project Proposals