Learning Covariant Feature Detectors
Karel Lenc and Andrea Vedaldi


SLIDE 1

Learning Covariant Feature Detectors

Karel Lenc and Andrea Vedaldi University of Oxford GMDL Workshop, ECCV 2016

SLIDE 2

Local Feature Detection for Image Matching

[Figure: two views of a scene; a feature detected at location x in the first image should be re-detected at the corresponding location xʹ in the second.]

SLIDE 3

The challenges of learning feature detectors

Detector goals:

  • select a small number of features → learn what to detect
  • detect consistently under a viewpoint change → learn covariance

[Figure: example images illustrating selection and covariance]

[A. Zisserman]

SLIDE 4

Prior work

Learning detectors

Anchor points defined a priori:

  • J. Sochman and J. Matas: Learning fast emulators of binary decision processes. IJCV 2009 (Haar cascade, Hessian detector).
  • E. Rosten, R. Porter, and T. Drummond: Faster and better: a machine learning approach to corner detection. TPAMI 2010 (decision tree, Harris detector).

Detect points stable over time:

  • Y. Verdie et al.: TILDE: A Temporally Invariant Learned DEtector. CVPR 2015.

SLIDE 5

Extensive prior work

Learning descriptors

  • S. Winder et al.: Picking the best DAISY. CVPR 2009.
  • K. Simonyan et al.: Learning local feature descriptors using convex optimization. TPAMI 2014.
  • S. Zagoruyko et al.: Learning to Compare Image Patches via Convolutional Neural Networks. CVPR 2015.
  • M. Paulin et al.: Local Convolutional Features with Unsupervised Training for Image Retrieval. ICCV 2015.
  • E. Simo-Serra et al.: Discriminative learning of deep convolutional feature point descriptors. ICCV 2015.
  • V. Balntas et al.: Learning local feature descriptors with triplets and shallow convolutional neural networks. BMVC 2016.
  • F. Radenovic et al.: CNN Image Retrieval Learns from BoW: Unsupervised Fine-Tuning with Hard Examples. ECCV 2016.

See also the "Local Features: State of the art, open problems and performance evaluation" workshop.

Detector ≠ descriptor

SLIDE 6

Bypassing the feature selection problem

Detection by regression

Detection by regression: design a function Ψ that translates the centre of an image window x on top of the nearest feature.

Advantages:

  • feature selection is performed implicitly by the function Ψ
  • the function Ψ can be implemented by a CNN
  • this simplifies the learning formulation

[Diagram: a CNN implementing Ψ maps the window x at location T to the normalised window T⁻¹x.]

SLIDE 7

A convolutional approach

From regression to detection

  • Apply the function Ψ at all locations, convolutionally.
  • Accumulate votes for the feature locations.
  • Perform non-maxima suppression in the vote map.

[Figure: input image → estimated translations → voted features]
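The three steps above can be sketched in a few lines of numpy. This is a hypothetical toy implementation, not the authors' code: the function and parameter names (`detect_by_voting`, `nms_radius`) are mine, and the dense regressor output is simulated rather than produced by a CNN.

```python
import numpy as np

def detect_by_voting(offsets, nms_radius=1):
    """offsets: (H, W, 2) array; offsets[y, x] is the regressed (dy, dx)
    from pixel (y, x) to its nearest feature. Returns vote map and peaks."""
    H, W, _ = offsets.shape
    votes = np.zeros((H, W))
    # step 1+2: every pixel casts a vote at its predicted feature location
    for y in range(H):
        for x in range(W):
            ty = int(round(y + offsets[y, x, 0]))
            tx = int(round(x + offsets[y, x, 1]))
            if 0 <= ty < H and 0 <= tx < W:
                votes[ty, tx] += 1
    # step 3: non-maxima suppression, keep pixels dominating their window
    r = nms_radius
    peaks = []
    for y in range(H):
        for x in range(W):
            patch = votes[max(0, y - r):y + r + 1, max(0, x - r):x + r + 1]
            if votes[y, x] > 0 and votes[y, x] == patch.max():
                peaks.append((y, x))
    return votes, peaks

# toy regressor output: every pixel points at one feature at (2, 3)
H, W = 5, 6
ys, xs = np.mgrid[0:H, 0:W]
offsets = np.stack([2 - ys, 3 - xs], axis=-1).astype(float)
votes, peaks = detect_by_voting(offsets)
print(peaks)  # -> [(2, 3)]
```

A real detector would replace the Python loops with a scatter-add over the CNN's dense offset field, but the accumulate-then-suppress logic is the same.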

SLIDE 8

The CNN normalizes the patch by putting the feature in the center

Covariance constraint

[Diagram: the CNN Ψ maps the patch x at location T to the normalised patch T⁻¹x, with the feature at its centre.]


SLIDE 10

Two translated patches become the same after normalization

Covariance constraint

[Diagram: the Siamese CNN Ψ maps x ↦ T and xʹ ↦ Tʹ; the normalised patches T⁻¹x and Tʹ⁻¹xʹ coincide. For translations T⁻¹ = −T, so the map between the two patches is Tʹ ○ I ○ (−T) = Tʹ − T.]


SLIDE 12

We do not know a priori what the normalized patch looks like

Covariance constraint

[Diagram: Ψ maps x ↦ T and xʹ ↦ Tʹ; the normalised patch itself is unknown, only the relation Tʹ − T between the two predictions is constrained.]

SLIDE 13

But we may know the transformation Tgt between the patches

Covariance constraint

Tgt = Tʹ − T = Ψ(xʹ) − Ψ(x)

[Diagram: patches x and xʹ related by the known transformation Tgt.]
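The constraint Tgt = Ψ(xʹ) − Ψ(x) can be verified numerically with a hypothetical ideal regressor on a 1D signal (my own toy construction, not from the paper): Ψ reports the offset of the single feature from the patch centre, and the difference of its outputs on two overlapping windows equals the translation between them.

```python
import numpy as np

def psi(patch):
    """Ideal translation regressor: offset of the (single) feature
    from the patch centre."""
    return int(np.argmax(patch)) - len(patch) // 2

signal = np.zeros(50)
signal[23] = 1.0                    # one feature, at index 23

def window(c, w=5):
    """Extract a patch of half-width w centred at c."""
    return signal[c - w : c + w + 1]

x, xp = window(20), window(26)      # two overlapping patches
T_gt = 20 - 26                      # xʹ is x translated by c − cʹ
assert T_gt == psi(xp) - psi(x)     # covariance constraint holds
print(psi(x), psi(xp), T_gt)        # -> 3 -3 -6
```

The point of the constraint is that it pins down only the *difference* of the two predictions; the absolute location Ψ assigns to the feature is left free, which is exactly the "we do not know the normalised patch a priori" situation of the previous slide.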

SLIDE 14

Satisfied if the normalised patches are identical

Covariance constraint

Tgt = Tʹ − T = Ψ(xʹ) − Ψ(x)

[Diagram: the normalised patches are different, so the constraint is violated.]

SLIDE 15

Satisfied if the normalised patches are identical

Covariance constraint

Tgt = Tʹ − T = Ψ(xʹ) − Ψ(x)

[Diagram: the normalised patches are the same, so the constraint is satisfied.]

SLIDE 16

Using synthetic translations of real patches

Learning formulation

The constraint Tgt = Ψ(xʹ) − Ψ(x) yields one loss term ‖Tgt − Ψ(xʹ) + Ψ(x)‖² ≅ 0 per pair. Sampling real patches xᵢ and synthetic translations Tgt,i gives the loss terms:

  x₁, x₁ʹ = Tgt,1 x₁ :  ‖Tgt,1 − Ψ(Tgt,1 x₁) + Ψ(x₁)‖²
  x₂, x₂ʹ = Tgt,2 x₂ :  ‖Tgt,2 − Ψ(Tgt,2 x₂) + Ψ(x₂)‖²
  x₃, x₃ʹ = Tgt,3 x₃ :  ‖Tgt,3 − Ψ(Tgt,3 x₃) + Ψ(x₃)‖²
  …
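This training objective can be sketched as follows. The sketch is mine and deliberately minimal: 1D patches stand in for image windows, `np.roll` plays the role of the synthetic translation, and `psi_ideal` / `psi_bad` are hypothetical regressors illustrating a zero-loss and a nonzero-loss case.

```python
import numpy as np

rng = np.random.default_rng(0)

def covariance_loss(psi, patches, t_gt):
    """Sum of ||Tgt - psi(Tgt x) + psi(x)||^2 over synthetic pairs."""
    total = 0.0
    for x, t in zip(patches, t_gt):
        x_warped = np.roll(x, t)        # synthetic translation of the patch
        total += (t - psi(x_warped) + psi(x)) ** 2
    return total

def psi_ideal(patch):                   # offset of the feature from centre
    return int(np.argmax(patch)) - len(patch) // 2

def psi_bad(patch):                     # degenerate regressor: always 0
    return 0

patches = []
for _ in range(3):
    p = np.zeros(11)
    p[rng.integers(3, 8)] = 1.0         # one feature per sampled patch
    patches.append(p)
t_gt = [1, -2, 2]                       # synthetic translations

print(covariance_loss(psi_ideal, patches, t_gt))  # 0.0: constraint satisfied
print(covariance_loss(psi_bad, patches, t_gt))    # 9.0: constraint violated
```

In the actual formulation Ψ is a CNN and the loss is minimised over its weights by stochastic gradient descent; the structure of each term is the same.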

SLIDE 17

Generalisation: affine covariant detectors

Features are oriented ellipses and transformations are affinities.

Ggt = Ψ(xʹ) ○ Ψ(x)⁻¹, with Ψ(x) = (A, T) and Ψ(xʹ) = (Aʹ, Tʹ)

[Diagram: both ellipses are normalised to the unit circle; composing Ψ(xʹ) ○ Ψ(x)⁻¹ recovers Ggt.]
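In the affine case the constraint is composition in the group rather than a vector difference. A minimal sketch, using 3×3 homogeneous matrices for the affinities (the helper `affinity` and the particular numbers are mine, for illustration only):

```python
import numpy as np

def affinity(A, T):
    """Build a 3x3 homogeneous matrix from linear part A and translation T."""
    M = np.eye(3)
    M[:2, :2] = A
    M[:2, 2] = T
    return M

# hypothetical regressor outputs Psi(x) = (A, T) for the two patches
psi_x  = affinity(np.array([[2.0, 0.0], [0.0, 2.0]]), np.array([1.0, -1.0]))
psi_xp = affinity(np.array([[0.0, -2.0], [2.0, 0.0]]), np.array([3.0,  0.5]))

# covariance constraint: Ggt = Psi(x') o Psi(x)^-1
G_gt = psi_xp @ np.linalg.inv(psi_x)

# consistency check: composing Ggt with Psi(x) recovers Psi(x')
assert np.allclose(G_gt @ psi_x, psi_xp)
print(G_gt)
```

For pure translations A = Aʹ = I and the composition reduces to the difference Tʹ − T of the earlier slides, so the translation case is a special case of this formula.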

SLIDE 18

A very general framework

By choosing different transformation groups, we obtain different detector types:

  Covariance group   Residual group   Feature           Example
  Translation        -                Point             FAST
  Similarity         -                Pointed disk      DoG + orientation detection
  Affine             -                Pointed ellipse   Harris-Affine + orientation detection
  Rotation           -                Line bundle       Orientation detection
  Euclidean          Rotation         Point             Harris
  Translation        1D translation   Line              Edge detector
  Similarity         Rotation         Disk              SIFT
  Affine             Rotation         Ellipse           Harris-Affine

SLIDE 19

Residual transformations

See the paper for a full theoretical characterisation.

DoG is covariant with scale, rotation and translation, but fixes only scale and translation. Solution: replace the identity with a residual transformation Q supplying the missing information:

  Tgt = Ψ(xʹ) ○ Q ○ Ψ(x)⁻¹, with Ψ(x) = (s, T), Ψ(xʹ) = (sʹ, Tʹ), Tgt = (sgt, Rgt, Tgt)


SLIDE 21

An example CNN architecture

DetNet: a simple Siamese CNN architecture with six layers to estimate T.

  • Training uses synthetic patch pairs generated from the ILSVRC12 training set.
  • Uses the Laplace operator to sample salient image patches of size 28 × 28.

[Diagram: two branches with shared weights, each
  Conv 5⨉5⨉40 → Pool ↓2 → Conv 5⨉5⨉100 → Pool ↓2 → Conv 4⨉4⨉300 → Conv 1⨉1⨉500 → Conv 1⨉1⨉500 → Conv 1⨉1⨉2,
map x and xʹ to Ψ(x) and Ψ(xʹ), which feed the loss ‖Tgt − Ψ(xʹ) + Ψ(x)‖².]
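A quick sanity check on the architecture: the branch is fully convolutional, and on a 28 × 28 training patch its spatial extent collapses to a single 1 × 1 × 2 output, i.e. one regressed translation per patch. The arithmetic below assumes 'valid' convolutions and non-overlapping stride-2 pooling, which is my reading of the layer sizes, not stated on the slide.

```python
def conv(size, k):
    """Spatial size after a 'valid' convolution with a k x k kernel."""
    return size - k + 1

def pool(size, s=2):
    """Spatial size after non-overlapping s x s pooling."""
    return size // s

size = 28                       # input patch is 28 x 28
size = pool(conv(size, 5))      # Conv 5x5x40  -> 24, Pool /2 -> 12
size = pool(conv(size, 5))      # Conv 5x5x100 -> 8,  Pool /2 -> 4
size = conv(size, 4)            # Conv 4x4x300 -> 1
size = conv(size, 1)            # Conv 1x1x500 -> 1
size = conv(size, 1)            # Conv 1x1x500 -> 1
size = conv(size, 1)            # Conv 1x1x2   -> 1
print(size)  # -> 1: a single 2-vector psi(x) per 28x28 patch
```

Because all layers are convolutions and poolings, the same network applied to a full image produces a dense map of translation estimates, which is what the voting scheme of the earlier slides consumes.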

SLIDE 22

DetNet Easy Test

SLIDE 23

DetNet Hard Test

SLIDE 24

Detection

SLIDE 25

Detection

SLIDE 26

Detection

SLIDE 27

Detection

SLIDE 28

DoG

SLIDE 29

Harris

SLIDE 30

Detection

SLIDE 31

Repeatability on DTU

DetNet results

SLIDE 32

Repeatability on DTU

DetNet results

SLIDE 33

Repeatability on DTU

DetNet results

SLIDE 34

RotNet Architecture

An L2 normalisation layer places the regressed vector on the unit circle.

[Diagram: as in DetNet, two branches with shared weights, each
  Conv 5⨉5⨉40 → Pool ↓2 → Conv 5⨉5⨉100 → Pool ↓2 → Conv 4⨉4⨉300 → Conv 1⨉1⨉500 → Conv 1⨉1⨉500 → Conv 1⨉1⨉2 → L2 Norm,
map x and xʹ to Ψ(x) and Ψ(xʹ); the loss is ‖Rgt − Ψ(xʹ) + Ψ(x)‖² with target Rgt = (cos 𝜄gt, sin 𝜄gt).]
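The L2 normalisation step can be sketched in a couple of lines. This is a toy illustration (the helper name and the sample vector are mine): the unconstrained 2-vector from the last convolution is projected onto the unit circle, so it always encodes a valid (cos 𝜄, sin 𝜄) pair, from which the orientation is recovered with arctan2.

```python
import numpy as np

def l2_normalize(v, eps=1e-8):
    """Project the regressed 2-vector onto the unit circle."""
    return v / (np.linalg.norm(v) + eps)

raw = np.array([3.0, 4.0])                   # unconstrained branch output
u = l2_normalize(raw)                        # -> approx [0.6, 0.8]
angle = np.degrees(np.arctan2(u[1], u[0]))   # recovered orientation, ~53.13 deg
print(u, angle)
```

Normalising inside the network, rather than regressing the angle directly, avoids the 2π wrap-around discontinuity in the loss.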

SLIDE 35

RotNet Easy Test

SLIDE 36

RotNet Hard Test

SLIDE 37

Summary

Detection by regression
  • bypasses the feature selection problem
  • formulated as a learnable convolutional regressor

Covariance constraint
  • covariance is all you need to learn features
  • naturally integrates with detection by regression
  • a very general formulation encompassing all detector types
  • bonus: a nice theory of features

Results
  • very good detectors can be learned
  • however, not yet dramatically better than existing ones

This is just the beginning
  • a whole new approach
  • many details still need ironing out

Models available at: https://github.com/lenck/ddet

SLIDE 38

Repeatability on VGG-Affine

DetNet results

SLIDE 39

Angle error and matching score on VGG-Affine

RotNet example patch pairs
