SLIDE 1

Local Feature Extraction and Learning for Computer Vision

Bin Fan, Chinese Academy of Sciences, China Jiwen Lu, Tsinghua University, China Pascal Fua, EPFL, Switzerland

CVPR’2017 Tutorials

SLIDE 2

Local Image Descriptors: A Tool for Matching “Things”

Which pixel goes where?

SLIDE 3

Which region goes where?

Local Image Descriptors: A Tool for Matching “Things”

SLIDE 4

  • Dense city 3D reconstruction / structure from motion
  • Content-based web image search

By matching “things” we can…

SLIDE 5

By matching “things” we can…

… track objects in real-time even when there are occlusions and motion blur.

SLIDE 6

  • Mobile augmented reality
  • Real-time pedestrian detection

By matching “things” we can…

SLIDE 7


By matching “things” we can…

… detect objects in crowded scenes.

SLIDE 8

By matching “things” we can…

… mosaic images into panoramas.

SLIDE 9

Which region goes where?

Local Image Descriptors: A Tool for Matching “Things”

SLIDE 10

Distinctiveness Robustness

Local Image Descriptors

SLIDE 11

[Timeline figure, 2004–2015: early methods → SIFT and its variants → binary descriptors → learning-based methods → CNN-based methods]

Local Descriptor Trends

SLIDE 12

Deep Learning Revolution

SLIDE 13
A Deep Casualty? Yes!

  • The SIFT paper is the most cited computer vision paper ever.
  • But it’s not as dominant as it once was.

 Will it endure?

SLIDE 14

Keypoints Remain Relevant

  • When accurate geometric recovery matters, they remain unequaled.
  • They are efficient for real-time applications.
  • They provide an effective way
    • to compress the information present in large images,
    • to recognize specific locations.
  • The algorithms do not need to be retrained for each new application.
  • Some or all elements of the pipeline can be deeply reformulated.

 Future algorithms will combine Deep Learning and keypoint matching.

SLIDE 15

Outline of the Tutorial

  • Classic Local Features
  • Towards High Performance Descriptors (Floating Point)
    • Handcrafted Descriptors
    • Learned Descriptors
  • Towards Efficient Descriptors (Binary)
    • Handcrafted Descriptors
    • Learned Descriptors
  • Applications
SLIDE 17

Classic Local Features

  • SIFT: Scale Invariant Feature Transform
  • SURF: Speeded Up Robust Features
  • Daisy
SLIDE 18

SIFT Pipeline

SLIDE 19

GSS: the Gaussian Scale Space is produced by iteratively convolving the previous layer with a Gaussian kernel. DoGSS: the DoG (Difference-of-Gaussians) Scale Space is produced by subtracting neighboring GSS layers.

Scale space detection

* By Cmglee - Own work, CC BY-SA 3.0, https://commons.wikimedia.org/w/index.php?curid=42549151

SIFT [Lowe’99] Classic Local Features
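The GSS/DoGSS construction described above can be condensed into a few lines. A minimal single-octave sketch, assuming SciPy's `gaussian_filter` (the helper name is illustrative; `sigma0 = 1.6` and the geometric scale step follow Lowe's paper):

```python
import numpy as np
from scipy.ndimage import gaussian_filter

def build_dog_scale_space(image, num_layers=6, sigma0=1.6):
    """One octave of SIFT's scale space: each GSS layer is the previous
    layer convolved with a Gaussian so the cumulative blur grows
    geometrically; DoGSS layers are differences of adjacent GSS layers."""
    k = 2.0 ** (1.0 / (num_layers - 3))          # scale step between layers
    gss = [gaussian_filter(np.asarray(image, dtype=np.float64), sigma0)]
    sigma_prev = sigma0
    for i in range(1, num_layers):
        sigma_total = sigma0 * k ** i
        # blur the *previous* layer by just enough to reach sigma_total
        sigma_inc = np.sqrt(sigma_total ** 2 - sigma_prev ** 2)
        gss.append(gaussian_filter(gss[-1], sigma_inc))
        sigma_prev = sigma_total
    dog = [b - a for a, b in zip(gss, gss[1:])]
    return gss, dog
```

Blurring the previous layer incrementally, rather than the original image, is what makes the iterative construction cheap.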

SLIDE 20

Search for extrema in the DoGSS to locate initial keypoints.

Non-max suppression in Scale Space

SIFT [Lowe’99] Classic Local Features
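The extremum search can be vectorized with rank filters. A sketch assuming SciPy (the `dog_extrema` helper and the 0.03 threshold are illustrative, not from the slides):

```python
import numpy as np
from scipy.ndimage import maximum_filter, minimum_filter

def dog_extrema(dog_stack, threshold=0.03):
    """Candidate keypoints: voxels of the (scale, y, x) DoG stack that
    are the max or min of their 3x3x3 neighbourhood and exceed a small
    absolute-response threshold (which also suppresses flat regions)."""
    stack = np.asarray(dog_stack, dtype=np.float64)
    is_max = stack == maximum_filter(stack, size=3)
    is_min = stack == minimum_filter(stack, size=3)
    mask = (is_max | is_min) & (np.abs(stack) > threshold)
    return np.argwhere(mask)   # rows of (scale, y, x)
```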

SLIDE 21
  • Refine:
    • Fit a 3D (x, y, scale) quadratic to the initial keypoint, and take the peak of the fit as the refined keypoint.
  • Elimination:
    • Discard keypoints with low refined DoG response.
    • Discard keypoints with high edge response.

Keypoint Refinement

SIFT [Lowe’99] Classic Local Features
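The refinement step solves for the peak of a quadratic fitted to the DoG values around a candidate. A finite-difference sketch (the `refine_keypoint` helper is hypothetical; boundary handling and the iterative re-fitting of full SIFT are omitted):

```python
import numpy as np

def refine_keypoint(stack, s, y, x):
    """Fit a 3D quadratic to the DoG stack around (s, y, x) via
    finite-difference gradient g and Hessian H, then solve for the
    sub-voxel offset of its extremum and the refined response there."""
    D = np.asarray(stack, dtype=np.float64)
    g = 0.5 * np.array([
        D[s+1, y, x] - D[s-1, y, x],
        D[s, y+1, x] - D[s, y-1, x],
        D[s, y, x+1] - D[s, y, x-1],
    ])
    H = np.empty((3, 3))
    H[0, 0] = D[s+1, y, x] - 2*D[s, y, x] + D[s-1, y, x]
    H[1, 1] = D[s, y+1, x] - 2*D[s, y, x] + D[s, y-1, x]
    H[2, 2] = D[s, y, x+1] - 2*D[s, y, x] + D[s, y, x-1]
    H[0, 1] = H[1, 0] = 0.25*(D[s+1, y+1, x] - D[s+1, y-1, x] - D[s-1, y+1, x] + D[s-1, y-1, x])
    H[0, 2] = H[2, 0] = 0.25*(D[s+1, y, x+1] - D[s+1, y, x-1] - D[s-1, y, x+1] + D[s-1, y, x-1])
    H[1, 2] = H[2, 1] = 0.25*(D[s, y+1, x+1] - D[s, y+1, x-1] - D[s, y-1, x+1] + D[s, y-1, x-1])
    offset = -np.linalg.solve(H, g)        # extremum of the fitted quadratic
    value = D[s, y, x] + 0.5 * g @ offset  # refined DoG response, used for elimination
    return offset, value
```

The refined response `value` is what the low-contrast elimination thresholds.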

SLIDE 22

Gradient and angle: Orientation selection

SIFT [Lowe’99] Classic Local Features

Dominant Orientation Estimation
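A minimal sketch of dominant-orientation estimation via a magnitude-weighted angle histogram (helper name illustrative; the Gaussian weighting and peak interpolation of full SIFT are omitted):

```python
import numpy as np

def dominant_orientation(patch, num_bins=36):
    """Histogram the gradient angles around the keypoint, weighted by
    gradient magnitude, and return the peak bin's centre in degrees."""
    gy, gx = np.gradient(np.asarray(patch, dtype=np.float64))
    mag = np.hypot(gx, gy)
    ang = np.rad2deg(np.arctan2(gy, gx)) % 360.0
    hist, _ = np.histogram(ang, bins=num_bins, range=(0, 360), weights=mag)
    peak = np.argmax(hist)
    return (peak + 0.5) * (360.0 / num_bins)
```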

SLIDE 23

SIFT [Lowe’99] Classic Local Features

Descriptor construction

SLIDE 24

SIFT [Lowe’99] Classic Local Features

Descriptor construction

  • 1. Find the blurred image of the closest scale in scale space.
  • 2. Sample the points around the keypoint.
  • 3. Rotate the gradients and coordinates by the dominant orientation.
  • 4. Separate the region into 4 × 4 sub-regions.
  • 5. Create an 8-bin orientation histogram for each sub-region.
  • 6. Normalize the concatenated 128-D vector.
SLIDE 25

SURF [Bay ’06] Classic Local Features

Speeded Up Robust Features

  • Aim: faster than SIFT, while still being robust.
    • 3–7 times faster than SIFT, with similar matching performance.
  • Key idea: Haar filters and the integral image.
  • Well received!
    • More than 8000 citations.
    • CVIU Most Cited Paper Award.
    • Koenderink Prize at ECCV’16, for fundamental contributions in computer vision that have stood the test of time.
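The integral image is what makes SURF's box filters cheap: once the table is built, any rectangle sum costs four lookups regardless of box size. A sketch with hypothetical helper names:

```python
import numpy as np

def integral_image(img):
    """Summed-area table with a zero top row/left column prepended,
    so rectangle sums need no boundary checks."""
    ii = np.cumsum(np.cumsum(np.asarray(img, dtype=np.float64), 0), 1)
    return np.pad(ii, ((1, 0), (1, 0)))

def box_sum(ii, y0, x0, y1, x1):
    """Sum of img[y0:y1, x0:x1] in constant time."""
    return ii[y1, x1] - ii[y0, x1] - ii[y1, x0] + ii[y0, x0]
```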
SLIDE 26

SURF [Bay ’06] Classic Local Features

Keypoint Detection

  • Uses the determinant of the Hessian matrix.
  • Approximates the 2nd derivatives in the Hessian matrix with box filters.
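The resulting detector response is the approximated determinant of the Hessian; the 0.9 weight is the correction from the SURF paper compensating for the box-filter approximation:

```python
def hessian_response(Dxx, Dyy, Dxy):
    """SURF's approximated determinant of the Hessian from box-filter
    responses; works on scalars or elementwise on NumPy arrays."""
    return Dxx * Dyy - (0.9 * Dxy) ** 2
```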

SLIDE 27

[Figure: Gaussian second-order derivatives Lxx, Lyy, Lxy and their box-filter approximations Dxx, Dyy, Dxy]

SURF [Bay ’06] Classic Local Features

Keypoint Detection

SLIDE 28

SURF [Bay ’06] Classic Local Features

SURF vs SIFT: Scale Space

  • SIFT: fix the filter size, decrease (downsample) the image size.
  • SURF: fix the image size, increase the filter size.

SLIDE 29

[Figure: Haar wavelet responses dx (x response) and dy (y response) computed within a radius r = 6s of an interest point at scale s]

SURF [Bay ’06] Classic Local Features

Dominant Orientation Estimation

  • The Haar wavelet responses (x and y) are represented as vectors.
  • Sum all responses within a sliding orientation window covering an angle of 60 degrees.
  • The longest summed vector gives the dominant orientation.
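The sliding-window step might be sketched as follows, given Haar responses and their angles at the sampled points (the `surf_orientation` helper and the 72 window positions are illustrative):

```python
import numpy as np

def surf_orientation(dx, dy, angles, window=np.pi / 3):
    """Slide a 60-degree angular window around the circle; at each
    position, sum the (dx, dy) responses whose angle falls inside it.
    The longest summed vector gives the dominant orientation (radians)."""
    best_len, best_ang = -1.0, 0.0
    for start in np.linspace(0.0, 2 * np.pi, 72, endpoint=False):
        inside = (angles - start) % (2 * np.pi) < window
        sx, sy = dx[inside].sum(), dy[inside].sum()
        length = sx * sx + sy * sy
        if length > best_len:
            best_len, best_ang = length, np.arctan2(sy, sx)
    return best_ang
```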
SLIDE 30

SURF [Bay ’06] Classic Local Features

Descriptor Extraction

SLIDE 31
  • 1. Split the interest region (20s × 20s) into 4 × 4 square sub-regions.
  • 2. Calculate the Haar wavelet responses dx and dy, and weight the responses with a Gaussian kernel.
  • 3. Sum the responses over each sub-region for dx and dy, then sum the absolute values of the responses.
  • 4. Concatenate the summation results from all sub-regions, forming a 64-D SURF descriptor.

SURF [Bay ’06] Classic Local Features

Descriptor Extraction
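Assuming the Haar responses over the 20×20 sample grid are already computed, the pooling into a 64-D vector could look like this (a sketch; the Gaussian weighting is omitted and the helper name is illustrative):

```python
import numpy as np

def surf_descriptor(dx, dy):
    """Pool Haar responses into 4x4 sub-regions; each contributes
    (sum dx, sum dy, sum |dx|, sum |dy|) -> 16 * 4 = 64-D, L2-normalised."""
    dx = np.asarray(dx, dtype=np.float64)
    dy = np.asarray(dy, dtype=np.float64)
    assert dx.shape == dy.shape == (20, 20)
    feats = []
    for by in range(4):
        for bx in range(4):
            sl = (slice(5*by, 5*by + 5), slice(5*bx, 5*bx + 5))
            feats += [dx[sl].sum(), dy[sl].sum(),
                      np.abs(dx[sl]).sum(), np.abs(dy[sl]).sum()]
    v = np.asarray(feats)
    return v / (np.linalg.norm(v) + 1e-12)
```

Keeping both the signed sums and the absolute sums lets the descriptor distinguish a flat region from one with strong but cancelling gradients.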

SLIDE 32

Daisy [Tola ’08] Classic Local Features

DAISY Descriptor

  • Log-polar grid arrangement
  • Gaussian pooling of histograms of gradient orientations
  • Efficient for dense computation, but not for sparse keypoints!
SLIDE 33

Daisy [Tola ’08] Classic Local Features

Efficient Dense Computation of Features

  • The computation mostly involves 1D convolutions.
  • Rotating the descriptor only involves reordering the histograms.
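The log-polar layout can be sketched as a centre point plus concentric rings of sample locations, at each of which Gaussian-pooled orientation histograms are read off (the default radius and ring counts below are illustrative, not DAISY's exact parameters):

```python
import numpy as np

def daisy_grid(radius=15, rings=3, points_per_ring=8):
    """Sample locations of a DAISY-style log-polar grid: the centre
    plus `rings` concentric rings of `points_per_ring` points each.
    Because the grid is rotationally symmetric, rotating the descriptor
    amounts to reordering the histograms sampled at these points."""
    pts = [(0.0, 0.0)]
    for r in range(1, rings + 1):
        rho = radius * r / rings
        for k in range(points_per_ring):
            theta = 2 * np.pi * k / points_per_ring
            pts.append((rho * np.cos(theta), rho * np.sin(theta)))
    return np.asarray(pts)   # shape (1 + rings * points_per_ring, 2)
```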