Learning Covariant Feature Detectors
Karel Lenc and Andrea Vedaldi, University of Oxford
GMDL Workshop, ECCV 2016
Local Feature Detection for Image Matching

[Figure: two views of the same scene; windows x and xʹ with detected features Ψ(x) and Ψ(xʹ)]

The challenges of learning feature detectors:
Detector goals:
- select a small number of features → learn what to detect (selection)
- detect consistently under a viewpoint change → learn covariance

[Figure: selection and covariance illustrated; image credit A. Zisserman]
Extensive prior work

Anchor points defined a priori:
- … IJCV 2009 (Haar cascade, Hessian detector)
- … approach to corner detection, TPAMI 2010 (decision tree, Harris detector)

Detect points stable over time:
- … TPAMI 2014
- … convolutional neural networks, BMVC 2016
- … Tuning with Hard Examples, ECCV 2016

See also the "Local Features: State of the art, open problems and performance evaluation" workshop.
Bypassing the feature selection problem

Detection by regression: design a function Ψ that translates the centre of an image window x on top of the nearest feature.

[Diagram: a CNN implements Ψ; patch x ↦ translation T; T⁻¹x is the normalised patch]

Advantages:
- feature selection is performed implicitly by the function Ψ
- the function Ψ can be implemented by a CNN
- this simplifies the learning formulation
A convolutional approach

- apply the function Ψ at all locations, convolutionally
- accumulate votes for the feature locations
- perform non-maximum suppression in the vote map

[Figure: input image → estimated translations → voted features]
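The three steps above can be sketched in a few lines. This is a minimal NumPy stand-in, not the paper's implementation: it assumes the regressor's per-pixel displacements are already given, accumulates them into a vote map, and keeps local maxima.

```python
import numpy as np

def detect_by_voting(displacements, nms_radius=1):
    """Accumulate regressed displacements into a vote map and return
    its local maxima (a toy stand-in for the convolutional detector).

    displacements: (H, W, 2) array; displacements[i, j] is the regressed
    offset (dy, dx) from pixel (i, j) to its nearest feature.
    """
    H, W, _ = displacements.shape
    votes = np.zeros((H, W))
    for i in range(H):
        for j in range(W):
            dy, dx = displacements[i, j]
            y, x = int(round(i + dy)), int(round(j + dx))
            if 0 <= y < H and 0 <= x < W:
                votes[y, x] += 1
    # non-maximum suppression: keep pixels that dominate their neighbourhood
    keep = []
    r = nms_radius
    for i in range(H):
        for j in range(W):
            patch = votes[max(0, i - r):i + r + 1, max(0, j - r):j + r + 1]
            if votes[i, j] > 0 and votes[i, j] == patch.max():
                keep.append((i, j, votes[i, j]))
    return votes, keep

# every pixel points at the window centre -> one strong feature there
H = W = 9
ys, xs = np.mgrid[0:H, 0:W]
disp = np.stack([4 - ys, 4 - xs], axis=-1).astype(float)
votes, feats = detect_by_voting(disp)
```

With this synthetic displacement field all 81 votes land on pixel (4, 4), which survives as the single detected feature.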
The CNN normalises the patch by putting the feature in the centre.

[Diagram: CNN Ψ maps patch x to translation T; T⁻¹x is the normalised patch]
Two translated patches become the same after normalisation.

[Diagram: Ψ(x) = T and Ψ(xʹ) = Tʹ; since for translations T⁻¹ = −T, the normalised patches T⁻¹x and Tʹ⁻¹xʹ coincide, and the map between the patches, Tʹ ○ I ○ (−T), reduces to Tʹ − T]
We do not know a priori what the normalised patch looks like.

[Diagram: Ψ(x) = T, Ψ(xʹ) = Tʹ; only their difference Tʹ − T is constrained]
But we may know the transformation Tgt between the patches:

Tgt = Tʹ − T = Ψ(xʹ) − Ψ(x)

This covariance constraint is satisfied exactly when the normalised patches are identical.

[Diagram: for different normalised patches the constraint fails; for identical ones it holds]
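The covariance constraint can be checked numerically in a toy 1-D setting. This sketch (my own, not from the talk) uses an idealised Ψ that regresses the offset from the window centre to the strongest sample:

```python
import numpy as np

def psi(x):
    """Ideal 1-D 'detector': the signed offset from the window
    centre to the strongest sample (the feature)."""
    return int(np.argmax(x)) - len(x) // 2

rng = np.random.default_rng(0)
signal = rng.random(50)
signal[23] = 10.0                    # one strong feature at position 23

def window(centre, half=5):
    return signal[centre - half:centre + half + 1]

x, xp = window(20), window(25)       # two windows around the same feature
# moving the window right by 5 shifts the feature 5 samples left inside
# the patch, so the regressed offsets differ by exactly the window shift:
assert psi(x) - psi(xp) == 25 - 20
```

The same identity is what the loss on the next slides enforces for a learned Ψ.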
Using synthetic translations of real patches
20
Tgt = Ψ(xʹ) - Ψ(x) ‖Tgt - Ψ(xʹ) + Ψ(x)‖2 ≅ 0 x1 Tgt,1 x1 = Tgt,1x1 ‖Tgt,1 - Ψ(Tgt,1x1) + Ψ(x1)‖2 Loss terms x2 Tgt,2 x1 = Tgt,1x1 ‖Tgt,2 - Ψ(Tgt,2x2) + Ψ(x2)‖2 x3 Tgt,3 x3 = Tgt,3x3 ‖Tgt,3 - Ψ(Tgt,3x3) + Ψ(x3)‖2 … … …
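Generating such synthetic pairs is just cropping two shifted windows and recording the shift. A NumPy sketch under my own conventions (the paper's sampling details differ):

```python
import numpy as np

def make_pair(image, size=28, max_shift=5, rng=None):
    """Sample one synthetic training pair: patch x, translated copy
    x', and the ground-truth translation T_gt relating their content."""
    if rng is None:
        rng = np.random.default_rng()
    H, W = image.shape
    ty, tx = rng.integers(-max_shift, max_shift + 1, size=2)
    y = int(rng.integers(max_shift, H - size - max_shift))
    x = int(rng.integers(max_shift, W - size - max_shift))
    patch = image[y:y + size, x:x + size]
    shifted = image[y + ty:y + ty + size, x + tx:x + tx + size]
    # shifting the sampling window by +t moves the content by -t
    return patch, shifted, np.array([-ty, -tx], float)

def loss_term(psi, x, xp, t_gt):
    """One term of the covariance loss: ||T_gt - psi(x') + psi(x)||^2."""
    return float(np.sum((t_gt - psi(xp) + psi(x)) ** 2))

# toy check with an 'argmax' detector on an image with one bright pixel
img = np.zeros((60, 60)); img[30, 30] = 1.0
def toy_psi(p):
    iy, ix = np.unravel_index(np.argmax(p), p.shape)
    return np.array([iy - p.shape[0] // 2, ix - p.shape[1] // 2], float)
x1 = img[10:38, 10:38]
x1p = img[13:41, 12:40]                       # same content, shifted
assert loss_term(toy_psi, x1, x1p, np.array([-3.0, -2.0])) == 0.0
```

An ideal detector incurs zero loss on every pair; a learned Ψ is trained to drive these terms toward zero.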
Features are oriented ellipses and transformations are affinities.

For a general group, the difference becomes a composition:

Ggt = Ψ(xʹ) ○ Ψ(x)⁻¹,  with Ψ(x) = (A, T) and Ψ(xʹ) = (Aʹ, Tʹ)

[Diagram: the normalised frames coincide when the constraint holds]
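The group composition is easy to make concrete with 3×3 homogeneous matrices. A short sketch with made-up frame values (the matrices below are illustrative, not regressed by any real network):

```python
import numpy as np

def affinity(A, t):
    """Encode a feature frame Psi = (A, T) as a 3x3 homogeneous matrix."""
    M = np.eye(3)
    M[:2, :2] = A
    M[:2, 2] = t
    return M

# hypothetical regressed frames for two views of the same feature
psi_x  = affinity(np.array([[1.2, 0.1], [0.0, 0.9]]), np.array([1.0, -2.0]))
psi_xp = affinity(np.array([[0.8, -0.2], [0.3, 1.1]]), np.array([-0.5, 0.4]))

# in a general group the difference T' - T becomes a composition:
G_gt = psi_xp @ np.linalg.inv(psi_x)        # G_gt = Psi(x') o Psi(x)^-1
assert np.allclose(G_gt @ psi_x, psi_xp)    # G_gt maps frame of x to frame of x'
```

For pure translations the matrices commute into the additive form Tgt = Tʹ − T, recovering the earlier constraint.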
By choosing different transformation groups, we obtain different detector types:

Covariance group | Residual group  | Feature         | Example
Translation      | -               | Point           | FAST
Similarity       | -               | Pointed disk    | DoG + orientation detection
Affine           | -               | Pointed ellipse | Harris Affine + orientation detection
Rotation         | -               | Line bundle     | Orientation detection
Euclidean        | Rotation        | Point           | Harris
Translation      | 1D translation  | Line            | Edge detector
Similarity       | Rotation        | Disk            | SIFT
Affine           | Rotation        | Ellipse         | Harris Affine

See the paper for a full theoretical characterisation.
Partial covariance: DoG is covariant with scale, rotation and translation, but fixes only scale and translation, Ψ(x) = (s, T), Ψ(xʹ) = (sʹ, Tʹ). Solution: replace the identity with a residual transformation Q supplying the missing information:

Ggt = (sgt, Rgt, Tgt) = Ψ(xʹ) ○ Q ○ Ψ(x)⁻¹

This is what the residual-group rows in the table above capture: for instance, the similarity group with a rotation residual corresponds to a plain disk and a DoG-style detector, with no orientation detection required.
DetNet: a simple Siamese CNN architecture with six layers to estimate T
- training uses synthetic patch pairs generated from ILSVRC12 train
- a Laplace operator samples salient image patches of size 28 × 28

Each branch (shared weights):
Conv 5 ⨉ 5 ⨉ 40 → Pool ↓2 → Conv 5 ⨉ 5 ⨉ 100 → Pool ↓2 → Conv 4 ⨉ 4 ⨉ 300 → Conv 1 ⨉ 1 ⨉ 500 → Conv 1 ⨉ 1 ⨉ 500 → Conv 1 ⨉ 1 ⨉ 2

One branch computes Ψ(x), the other Ψ(xʹ); the loss is ‖Tgt − Ψ(xʹ) + Ψ(x)‖².
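The layer sizes can be sanity-checked. Assuming 'valid' stride-1 convolutions and 2×2 stride-2 pooling (standard choices; padding is not stated on the slide), a 28×28 patch collapses to 1×1, so the final Conv 1⨉1⨉2 emits a single translation vector per patch, and at test time the same network runs fully convolutionally on whole images. A small helper (my own) traces the spatial resolution:

```python
def trace_sizes(size, layers):
    """Trace spatial resolution through 'valid' stride-1 convolutions
    ('conv', k) and 2x2 stride-2 pooling steps ('pool',)."""
    sizes = [size]
    for layer in layers:
        if layer[0] == 'conv':
            size = size - layer[1] + 1   # valid convolution shrinks by k-1
        else:
            size = size // 2             # 2x2 pooling halves the resolution
        sizes.append(size)
    return sizes

# DetNet stack: Conv5 -> Pool -> Conv5 -> Pool -> Conv4 -> three 1x1 convs
stack = [('conv', 5), ('pool',), ('conv', 5), ('pool',),
         ('conv', 4), ('conv', 1), ('conv', 1), ('conv', 1)]
assert trace_sizes(28, stack) == [28, 24, 12, 8, 4, 1, 1, 1, 1]
```

The 4⨉4 convolution is exactly what is needed to reduce the remaining 4⨉4 map to a single spatial position.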
Repeatability on DTU

[Plots: repeatability results on the DTU dataset, across three slides]
Orientation detector: an L2-normalisation layer places the regressed output on a unit circle.

Each branch (shared weights):
Conv 5 ⨉ 5 ⨉ 40 → Pool ↓2 → Conv 5 ⨉ 5 ⨉ 100 → Pool ↓2 → Conv 4 ⨉ 4 ⨉ 300 → Conv 1 ⨉ 1 ⨉ 500 → Conv 1 ⨉ 1 ⨉ 500 → Conv 1 ⨉ 1 ⨉ 2 → L2 Norm

Loss: ‖Rgt − Ψ(xʹ) + Ψ(x)‖², where Rgt = (cos 𝜄gt, sin 𝜄gt).
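The normalisation step itself is a one-liner: the network's raw 2-vector is projected onto the unit circle so it can be read as (cos 𝜄, sin 𝜄). A sketch with a made-up raw output:

```python
import numpy as np

def l2_normalise(v, eps=1e-8):
    """Project the regressed 2-vector onto the unit circle."""
    return v / (np.linalg.norm(v) + eps)

raw = np.array([3.0, 4.0])            # hypothetical raw network output
u = l2_normalise(raw)                 # now a valid (cos, sin) pair
angle = np.arctan2(u[1], u[0])        # recovered orientation in radians

assert np.isclose(np.linalg.norm(u), 1.0)
assert np.isclose(angle, np.arctan2(4.0, 3.0))
```

Normalising leaves the direction, and hence the recovered angle, unchanged while constraining the output to a valid rotation representation.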
Summary

Detection by regression:
- bypasses the feature selection problem
- is formulated as a learnable convolutional regressor

Covariance constraint:
- covariance is all you need to learn features
- naturally integrates with detection by regression
- a very general formulation encompassing all detector types
- bonus: a nice theory of features

Results:
- very good detectors can be learned
- however, not yet dramatically better than existing ones

This is just the beginning:
- a whole new approach
- many details still need ironing out

DetNet models available at: https://github.com/lenck/ddet
Repeatability on VGG-Affine

[Plot: repeatability results on the VGG-Affine benchmark]

Angle error and matching score on VGG-Affine

[Plot: orientation angle error and matching score on VGG-Affine]